Targeted Activation Penalties Help CNNs Ignore Spurious Signals
DOI:
https://doi.org/10.1609/aaai.v38i15.29610
Keywords:
ML: Transparent, Interpretable, Explainable ML; CV: Bias, Fairness & Privacy; CV: Interpretability, Explainability, and Transparency; HAI: Human-in-the-loop Machine Learning; ML: Ethics, Bias, and Fairness; PEAI: Safety, Robustness & Trustworthiness
Abstract
Neural networks (NNs) can learn to rely on spurious signals in the training data, leading to poor generalisation. Recent methods tackle this problem by training NNs with additional ground-truth annotations of such signals. These methods may, however, let spurious signals re-emerge in deep convolutional NNs (CNNs). We propose Targeted Activation Penalty (TAP), a new method tackling the same problem by penalising activations to control the re-emergence of spurious signals in deep CNNs, while also lowering training times and memory usage. In addition, ground-truth annotations can be expensive to obtain. We show that TAP still works well with annotations generated by pre-trained models as effective substitutes for ground-truth annotations. We demonstrate the power of TAP against two state-of-the-art baselines on the MNIST benchmark and on two clinical image datasets, using four different CNN architectures.
Downloads
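The core idea the abstract describes, penalising activations that fall on annotated spurious regions, can be sketched minimally as follows. This is an illustrative NumPy example under our own assumptions (the function name, the binary-mask annotation format, and the `weight` hyperparameter are all hypothetical), not the authors' exact formulation of TAP:

```python
import numpy as np

def targeted_activation_penalty(activations, spurious_mask, weight=1.0):
    """Penalise feature-map activations overlapping an annotated spurious region.

    activations:   (C, H, W) feature maps from an intermediate CNN layer.
    spurious_mask: (H, W) binary mask, 1 where a spurious signal is annotated.
    weight:        hypothetical penalty coefficient.
    """
    # Keep only activation magnitudes on annotated spurious pixels,
    # then average over the masked area so the penalty is scale-stable.
    masked = np.abs(activations) * spurious_mask
    return weight * masked.sum() / max(spurious_mask.sum(), 1)

# Toy example: one 4x4 feature map of ones, spurious patch in the top-left 2x2.
acts = np.ones((1, 4, 4))
mask = np.zeros((4, 4))
mask[:2, :2] = 1.0
penalty = targeted_activation_penalty(acts, mask)  # mean |activation| over the patch
```

In training, a term of this form would be added to the task loss so that gradient descent suppresses activations in spurious regions while leaving the rest of the feature map free to fit the task.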
Published
2024-03-24
How to Cite
Zhang, D., Williams, M., & Toni, F. (2024). Targeted Activation Penalties Help CNNs Ignore Spurious Signals. Proceedings of the AAAI Conference on Artificial Intelligence, 38(15), 16705-16713. https://doi.org/10.1609/aaai.v38i15.29610
Issue
Section
AAAI Technical Track on Machine Learning VI