REGLO: Provable Neural Network Repair for Global Robustness Properties
DOI:
https://doi.org/10.1609/aaai.v38i11.29094
Keywords:
ML: Ethics, Bias, and Fairness, ML: Adversarial Learning & Robustness, ML: Privacy
Abstract
We present REGLO, a novel methodology for repairing pretrained neural networks to satisfy global robustness and individual fairness properties. A neural network is said to be globally robust with respect to a given input region if and only if all the input points in the region are locally robust. This notion of global robustness also captures the notion of individual fairness as a special case. We prove that any counterexample to a global robustness property must exhibit a corresponding large gradient. For ReLU networks, this result allows us to efficiently identify the linear regions that violate a given global robustness property. By formulating and solving a suitable robust convex optimization problem, REGLO then computes a minimal weight change that will provably repair these violating linear regions.
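The link between counterexamples and large gradients can be illustrated with a minimal NumPy sketch. It is not the REGLO implementation. It assumes one common formalization of the property (inputs within distance `delta` in the infinity norm must produce outputs within `epsilon`), which may differ from the paper's exact definition. The function names `relu_region_jacobian` and `region_may_violate` and the toy weights are hypothetical. Inside a single linear region a ReLU network is affine, so its Jacobian is constant there, and a region whose Jacobian gain times `delta` stays below `epsilon` cannot contain a violating pair that lies entirely inside it.

```python
import numpy as np


def relu_region_jacobian(weights, biases, x):
    """Return the activation pattern and Jacobian of a ReLU MLP at x.

    The network is affine inside one linear region, so the Jacobian
    computed here is the same for every point sharing this pattern.
    """
    h = np.asarray(x, dtype=float)
    jac = np.eye(h.size)
    pattern = []
    # Hidden layers: affine map followed by ReLU.
    for W, b in zip(weights[:-1], biases[:-1]):
        z = W @ h + b
        active = z > 0
        pattern.append(active)
        h = np.where(active, z, 0.0)
        # Chain rule: rows belonging to inactive neurons are zeroed.
        jac = (W * active[:, None]) @ jac
    # Output layer: affine, no activation.
    jac = weights[-1] @ jac
    return pattern, jac


def region_may_violate(jac, delta, epsilon):
    """Check the necessary condition gain * delta > epsilon.

    Assumption: the property is stated in the infinity norm, so the
    gain is the induced infinity norm (maximum absolute row sum).
    """
    gain = np.abs(jac).sum(axis=1).max()
    return bool(gain * delta > epsilon), float(gain)


# Toy 2-2-1 network (hypothetical weights, for illustration only).
weights = [np.array([[1.0, -1.0], [2.0, 0.0]]), np.array([[1.0, 1.0]])]
biases = [np.zeros(2), np.zeros(1)]

_, jac_active = relu_region_jacobian(weights, biases, [1.0, 0.5])
_, jac_dead = relu_region_jacobian(weights, biases, [-1.0, 0.5])

flag_active, gain_active = region_may_violate(jac_active, delta=0.1, epsilon=0.3)
flag_dead, gain_dead = region_may_violate(jac_dead, delta=0.1, epsilon=0.3)
```

This check only screens regions. It does not handle input pairs that straddle two regions, and it performs no repair; the abstract describes the repair step as a robust convex optimization over the weights, which this sketch does not attempt.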
Published
2024-03-24
How to Cite
Fu, F., Wang, Z., Zhou, W., Wang, Y., Fan, J., Huang, C., Zhu, Q., Chen, X., & Li, W. (2024). REGLO: Provable Neural Network Repair for Global Robustness Properties. Proceedings of the AAAI Conference on Artificial Intelligence, 38(11), 12061-12071. https://doi.org/10.1609/aaai.v38i11.29094
Section
AAAI Technical Track on Machine Learning II