Beyond Patches: Mining Interpretable Part-Prototypes for Explainable AI

Authors

  • Mahdi Alehdaghi LIVIA, ILLS, Dept. of Systems Engineering, ETS Montreal, Canada
  • Rajarshi Bhattacharya LIVIA, ILLS, Dept. of Systems Engineering, ETS Montreal, Canada
  • Pourya Shamsolmoali Dept. of Computer Science, University of York, UK
  • Rafael M. O. Cruz LIVIA, ILLS, Dept. of Systems Engineering, ETS Montreal, Canada
  • Eric Granger LIVIA, ILLS, Dept. of Systems Engineering, ETS Montreal, Canada

DOI:

https://doi.org/10.1609/aaai.v40i44.41052

Abstract

As AI systems become more capable, it is important that their decisions are understandable and aligned with human expectations. A key challenge is the lack of interpretability in deep models. Existing methods such as Grad-CAM generate heatmaps but provide limited conceptual insight, while prototype-based approaches offer example-based explanations but often rely on rigid region selection and lack semantic consistency. To address these limitations, we propose PCMNet, a Part-Prototypical Concept Mining Network that learns human-comprehensible prototypes from meaningful regions without extra supervision. By clustering these prototypes into concept groups and extracting concept activation vectors, PCMNet provides structured, concept-level explanations and enhances robustness under occlusion and adversarial conditions, both of which are critical for building reliable and aligned AI systems. Experiments across multiple benchmarks show that PCMNet outperforms state-of-the-art methods in interpretability, stability, and robustness. This work contributes to AI alignment by enhancing transparency, controllability, and trustworthiness in modern AI systems.
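The abstract describes grouping mined part-prototype vectors into concept clusters and scoring samples against concept activation vectors. The following is a minimal illustrative sketch of that general idea, not the paper's implementation: prototypes are clustered with a simple k-means loop, each cluster centroid stands in for a concept activation vector (CAV), and a sample's feature is explained by cosine similarity to each CAV. All function names, the centroid-as-CAV choice, and the clustering details here are assumptions for illustration.

```python
import numpy as np

def mine_concepts(prototypes, n_concepts, n_iters=50, seed=0):
    """Cluster part-prototype vectors (n_protos x d) into concept groups
    using plain k-means; return cluster labels and one centroid per
    concept, used here as a stand-in concept activation vector (CAV)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from randomly chosen prototypes.
    centroids = prototypes[rng.choice(len(prototypes), n_concepts, replace=False)].copy()
    for _ in range(n_iters):
        # Assign each prototype to its nearest centroid.
        dists = np.linalg.norm(prototypes[:, None] - centroids[None], axis=-1)
        labels = dists.argmin(axis=1)
        # Update each centroid to the mean of its assigned prototypes.
        for k in range(n_concepts):
            members = prototypes[labels == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return labels, centroids

def concept_scores(feature, cavs):
    """Concept-level explanation of one feature vector: cosine similarity
    to each concept activation vector."""
    sims = cavs @ feature
    norms = np.linalg.norm(cavs, axis=1) * np.linalg.norm(feature) + 1e-8
    return sims / norms
```

A usage pattern would be to mine concepts once over all part-prototypes learned by the backbone, then call `concept_scores` per test sample to rank which concepts drove its prediction.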

Published

2026-03-14

How to Cite

Alehdaghi, M., Bhattacharya, R., Shamsolmoali, P., M. O. Cruz, R., & Granger, E. (2026). Beyond Patches: Mining Interpretable Part-Prototypes for Explainable AI. Proceedings of the AAAI Conference on Artificial Intelligence, 40(44), 37213–37221. https://doi.org/10.1609/aaai.v40i44.41052

Issue

Vol. 40 No. 44

Section

AAAI Special Track on AI Alignment