On Generating Plausible Counterfactual and Semi-Factual Explanations for Deep Learning
Keywords: Accountability, Interpretability & Explainability, Safety, Robustness & Trustworthiness, (Deep) Neural Network Algorithms, Ethics -- Bias, Fairness, Transparency & Privacy
Abstract
There is a growing concern that the recent progress made in AI, especially regarding the predictive competence of deep learning models, will be undermined by a failure to properly explain their operation and outputs. In response to this disquiet, counterfactual explanations have become very popular in eXplainable AI (XAI) due to their asserted computational, psychological, and legal benefits. In contrast however, semi-factuals (which appear to be equally useful) have surprisingly received no attention. Most counterfactual methods address tabular rather than image data, partly because the non-discrete nature of images makes good counterfactuals difficult to define; indeed, generating plausible counterfactual images which lie on the data manifold is also problematic. This paper advances a novel method for generating plausible counterfactuals and semi-factuals for black-box CNN classifiers doing computer vision. The present method, called PlausIble Exceptionality-based Contrastive Explanations (PIECE), modifies all “exceptional” features in a test image to be “normal” from the perspective of the counterfactual class, to generate plausible counterfactual images. Two controlled experiments compare this method to others in the literature, showing that PIECE generates highly plausible counterfactuals (and the best semi-factuals) on several benchmark measures.
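The core idea stated in the abstract, replacing "exceptional" features with values that are "normal" for the counterfactual class, can be illustrated with a minimal sketch. This is not the paper's implementation: the per-feature mean and standard deviation, the z-score threshold, the replacement by the class mean, and the function name `normalize_exceptional_features` are all assumptions made for illustration, and the step of turning the modified features back into an image is omitted entirely.

```python
import numpy as np

def normalize_exceptional_features(x, class_mean, class_std, z_threshold=2.0):
    """Illustrative only: flag features that look unusual for the
    counterfactual class and reset them to that class's mean.

    ASSUMPTION: exceptionality is approximated by a per-feature z-score
    against counterfactual-class statistics; the abstract does not
    specify the statistical model actually used.
    """
    x = np.asarray(x, dtype=float)
    class_mean = np.asarray(class_mean, dtype=float)
    class_std = np.asarray(class_std, dtype=float)

    # Distance of each feature from the counterfactual-class mean,
    # in units of that class's standard deviation.
    z = np.abs(x - class_mean) / np.maximum(class_std, 1e-12)

    # Features beyond the (hypothetical) threshold count as "exceptional".
    exceptional = z > z_threshold

    # Replace exceptional features with the class mean; keep the rest.
    x_prime = np.where(exceptional, class_mean, x)
    return x_prime, exceptional
```

Under these assumptions, a feature vector `[0.0, 5.0, 1.0]` compared against a counterfactual class with mean `[0.0, 0.0, 1.0]` and unit standard deviations would have only its second feature flagged and reset.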
How to Cite
Kenny, E. M., & Keane, M. T. (2021). On Generating Plausible Counterfactual and Semi-Factual Explanations for Deep Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 35(13), 11575-11585. https://doi.org/10.1609/aaai.v35i13.17377
AAAI Technical Track on Philosophy and Ethics of AI