Interpretability Benchmark for Evaluating Spatial Misalignment of Prototypical Parts Explanations

Mikołaj Sacha; Bartosz Jura; Dawid Rymarczyk; Łukasz Struski; Jacek Tabor; Bartosz Zieliński

doi:10.1609/aaai.v38i19.30154

Authors

Mikołaj Sacha: Faculty of Mathematics and Computer Science, Jagiellonian University; Doctoral School of Exact and Natural Sciences, Jagiellonian University
Bartosz Jura: Łukasiewicz Research Network – Poznań Institute of Technology; Faculty of Management and Social Communication, Jagiellonian University
Dawid Rymarczyk: Faculty of Mathematics and Computer Science, Jagiellonian University; Doctoral School of Exact and Natural Sciences, Jagiellonian University; Ardigen SA
Łukasz Struski: Faculty of Mathematics and Computer Science, Jagiellonian University
Jacek Tabor: Faculty of Mathematics and Computer Science, Jagiellonian University
Bartosz Zieliński: Faculty of Mathematics and Computer Science, Jagiellonian University; IDEAS NCBR

DOI:

https://doi.org/10.1609/aaai.v38i19.30154

Keywords:

General

Abstract

Prototypical parts-based networks are becoming increasingly popular due to their faithful self-explanations. However, their similarity maps are calculated in the penultimate network layer. Therefore, the receptive field of the prototype activation region often depends on parts of the image outside this region, which can lead to misleading interpretations. We name this undesired behavior a spatial explanation misalignment and introduce an interpretability benchmark with a set of dedicated metrics for quantifying this phenomenon. In addition, we propose a method for misalignment compensation and apply it to existing state-of-the-art models. We show the expressiveness of our benchmark and the effectiveness of the proposed compensation methodology through extensive empirical studies.
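
The receptive-field effect described above can be illustrated with a small calculation. The following is a minimal sketch, not the benchmark or the metrics proposed in the paper: it applies the standard receptive-field recurrence to a hypothetical toy backbone (the layer list `toy_backbone` and the function name `receptive_field` are assumptions made for illustration, and padding and dilation are ignored). It shows that one cell of a late-layer similarity map nominally covers a small patch of the input, while its value may depend on a much larger window.

```python
def receptive_field(layers):
    """Return (receptive_field, jump) in input pixels for a stack of
    (kernel_size, stride) layers, ignoring padding and dilation.

    `jump` is the distance in input pixels between neighbouring cells
    of the final feature map, i.e. the patch size a cell nominally
    covers when the map is upsampled to the input resolution.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the window
        jump *= stride             # each stride coarsens the grid
    return rf, jump


# Hypothetical toy backbone (NOT a model from the paper):
# two 3x3 convolutions followed by 2x2 pooling, repeated twice.
toy_backbone = [(3, 1), (3, 1), (2, 2), (3, 1), (3, 1), (2, 2)]

rf, jump = receptive_field(toy_backbone)
print(f"nominal patch per map cell: {jump}x{jump} px")
print(f"receptive field per map cell: {rf}x{rf} px")
```

Under these assumptions, a highlighted cell would be drawn over a patch far smaller than the window that can influence its activation, which is the kind of gap the abstract refers to as spatial explanation misalignment. Real backbones are deeper, so the gap would typically be larger.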
