PICE: Polyhedral Complex Informed Counterfactual Explanations
Abstract
Polyhedral geometry can shed light on the behaviour of piecewise linear neural networks, such as ReLU-based architectures. Counterfactual explanations are a popular class of methods that examine model behaviour by comparing a query to the closest point with a different label, subject to constraints. We present a new algorithm, Polyhedral-complex Informed Counterfactual Explanations (PICE), which uses the decomposition of a piecewise linear neural network into a polyhedral complex to find, for any given query, counterfactuals that are provably minimal in the Euclidean norm and lie exactly on the decision boundary. We also develop variants of the algorithm that target popular counterfactual desiderata: sparsity, robustness, speed, plausibility, and actionability. On four publicly available real-world datasets, we show empirically that our method outperforms other popular techniques for finding counterfactuals and adversarial attacks, measured by distance to the decision boundary and distance to the query. Experimental evaluations further show that each variant improves on our baseline method along the desideratum it targets.
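The geometric idea behind the abstract can be illustrated with a minimal sketch. This is not the PICE algorithm itself, only the single-region building block it rests on: inside one linear region, a ReLU network is an affine map, so the nearest point where the logit is zero is an orthogonal projection onto a hyperplane. The sketch assumes a binary classifier with one scalar output logit and a decision boundary at zero; the function names (`local_affine_map`, `forward`, `project_to_local_boundary`) and the toy weights are hypothetical and do not come from the paper.

```python
import numpy as np


def forward(weights, biases, x):
    """Evaluate a fully connected ReLU network with a linear output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + b, 0.0)
    return weights[-1] @ h + biases[-1]


def local_affine_map(weights, biases, x):
    """Return (A, c) such that f(z) = A @ z + c on the linear region containing x."""
    A = np.eye(x.shape[0])
    c = np.zeros(x.shape[0])
    for W, b in zip(weights[:-1], biases[:-1]):
        pre = W @ (A @ x + c) + b
        mask = (pre > 0).astype(float)  # activation pattern of this layer at x
        A = mask[:, None] * (W @ A)     # zero out rows of inactive units
        c = mask * (W @ c + b)
    return weights[-1] @ A, weights[-1] @ c + biases[-1]


def project_to_local_boundary(weights, biases, x):
    """Project x onto the zero set of the affine piece that is active at x.

    The result is a true boundary point only if it stays in the same linear
    region as x; a full method would also have to search neighbouring regions.
    """
    A, c = local_affine_map(weights, biases, x)
    a, d = A[0], c[0]  # assumption: a single scalar output logit
    norm_sq = a @ a
    if norm_sq == 0.0:
        raise ValueError("network is locally constant; no boundary in this region")
    return x - (a @ x + d) / norm_sq * a


# Toy network (hypothetical weights): f(x) = relu(x1) + relu(x2) - 1
weights = [np.eye(2), np.array([[1.0, 1.0]])]
biases = [np.zeros(2), np.array([-1.0])]
x_query = np.array([2.0, 2.0])

x_cf = project_to_local_boundary(weights, biases, x_query)
print(x_cf, forward(weights, biases, x_cf))
```

For this toy network the projected point stays in the same region as the query, so it lies on the true decision boundary. In general the projection can leave the region, which is why a complete method has to reason over the whole polyhedral complex rather than a single piece.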
Published
2024-10-16
Section
Full Archival Papers