On the Importance of Application-Grounded Experimental Design for Evaluating Explainable ML Methods

Kasun Amarasinghe; Kit T. Rodolfa; Sérgio Jesus; Valerie Chen; Vladimir Balayan; Pedro Saleiro; Pedro Bizarro; Ameet Talwalkar; Rayid Ghani

doi:10.1609/aaai.v38i19.30082

Authors

Kasun Amarasinghe, Carnegie Mellon University, Pittsburgh, PA
Kit T. Rodolfa, Stanford University, Palo Alto, CA
Sérgio Jesus, Feedzai, Lisboa, Portugal
Valerie Chen, Carnegie Mellon University, Pittsburgh, PA
Vladimir Balayan, Feedzai, Lisboa, Portugal
Pedro Saleiro, Feedzai, Lisboa, Portugal
Pedro Bizarro, Feedzai, Lisboa, Portugal
Ameet Talwalkar, Carnegie Mellon University, Pittsburgh, PA
Rayid Ghani, Carnegie Mellon University, Pittsburgh, PA

Keywords:

General

Abstract

Most existing evaluations of explainable machine learning (ML) methods rely on simplifying assumptions or proxies that do not reflect real-world use cases; the handful of more robust evaluations in real-world settings have shortcomings in their design, which generally lead to overestimation of the methods' real-world utility. In this work, we seek to address this gap by conducting a study that evaluates post-hoc explainable ML methods in a setting consistent with the application context and that provides a template for future evaluation studies. We modify and improve a prior study on e-commerce fraud detection by relaxing the simplifying assumptions in the original work that departed from the deployment context. Our study finds no evidence for the utility of the tested explainable ML methods in this context, a drastically different conclusion from that of the earlier work. This difference highlights how seemingly trivial experimental design choices can yield misleading conclusions about method utility. In addition, our work carries lessons about the necessity of not only evaluating explainable ML methods using tasks, data, users, and metrics grounded in the intended application context, but also developing methods tailored to specific applications, moving beyond general-purpose explainable ML methods.
