Q-SENN: Quantized Self-Explaining Neural Networks

Thomas Norrenbrock; Marco Rudolph; Bodo Rosenhahn

doi:10.1609/aaai.v38i19.30145

Authors

Thomas Norrenbrock, Leibniz University Hannover, Institute for Information Processing (tnt), L3S
Marco Rudolph, Leibniz University Hannover, Institute for Information Processing (tnt), L3S
Bodo Rosenhahn, Leibniz University Hannover, Institute for Information Processing (tnt), L3S

Keywords:

General

Abstract

Explanations in Computer Vision are often desired, but most Deep Neural Networks can only provide saliency maps with questionable faithfulness. Self-Explaining Neural Networks (SENN) extract interpretable concepts with fidelity, diversity, and grounding to combine them linearly for decision-making. While they can explain what was recognized, initial realizations lack accuracy and general applicability. We propose the Quantized Self-Explaining Neural Network (Q-SENN). Q-SENN satisfies or exceeds the desiderata of SENN while being applicable to more complex datasets and maintaining most or all of the accuracy of an uninterpretable baseline model, outperforming previous work in all considered metrics. Q-SENN describes the relationship between every class and feature as either positive, negative, or neutral instead of an arbitrary number of possible relations, enforcing more binary, human-friendly features. Since every class is assigned just 5 interpretable features on average, Q-SENN shows convincing local and global interpretability. Additionally, we propose a feature alignment method capable of aligning learned features with human language-based concepts without additional supervision. Thus, what is learned can be more easily verbalized. The code is published at https://github.com/ThomasNorr/Q-SENN
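
To make the idea of a positive, negative, or neutral class-feature relationship concrete, the following is a minimal sketch in plain Python. It is not the authors' training procedure or the code in their repository. The function names (`ternarize_top_k`, `class_scores`, `explain`), the top-k magnitude rule, and the toy numbers are all assumptions chosen for illustration. It only shows how a sparse ternary decision layer combines features linearly and how the per-class assignments can be read off as an explanation.

```python
from typing import Dict, List


def ternarize_top_k(weights: List[List[float]], k: int = 5) -> List[List[int]]:
    """Illustrative rule (an assumption, not the paper's method):
    keep the k largest-magnitude weights per class and reduce them to
    their sign, so each class-feature relation is +1, -1, or 0."""
    ternary = []
    for row in weights:
        # Indices of the k entries with the largest absolute value.
        order = sorted(range(len(row)), key=lambda j: abs(row[j]), reverse=True)
        keep = set(order[:k])
        ternary.append([
            (1 if row[j] > 0 else -1) if (j in keep and row[j] != 0) else 0
            for j in range(len(row))
        ])
    return ternary


def class_scores(ternary: List[List[int]], features: List[float]) -> List[float]:
    """Linear combination of features using the ternary class-feature matrix."""
    return [sum(w * f for w, f in zip(row, features)) for row in ternary]


def explain(ternary_row: List[int]) -> Dict[int, str]:
    """Global explanation for one class: which features count for or against it."""
    return {
        j: ("positive" if w > 0 else "negative")
        for j, w in enumerate(ternary_row)
        if w != 0
    }


# Toy example: 2 classes, 4 features, 2 features kept per class (hypothetical values).
dense = [[0.9, -0.1, 0.0, -0.7],
         [0.05, 0.8, -0.6, 0.0]]
ternary = ternarize_top_k(dense, k=2)
scores = class_scores(ternary, [1.0, 0.5, 0.2, 0.3])
```

In this sketch, a neutral relation is simply a zero entry, so a class's explanation lists only the few features assigned to it. This mirrors the abstract's point that every class uses about 5 features on average.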
