TWINFUZZ: Dual-Model Fuzzing for Robustness Generalization in Deep Learning
DOI:
https://doi.org/10.1609/aaai.v40i44.41063
Abstract
Deep learning (DL) models are increasingly deployed in safety-critical applications such as face recognition, autonomous driving, and medical diagnosis. Despite their impressive accuracy, they remain vulnerable to adversarial examples: subtle perturbations that cause incorrect predictions, exposing fundamental robustness issues. While adversarial training improves robustness against known attacks, it often fails to generalize to unseen or stronger threats, revealing a critical gap in robustness generalization. In this work, we propose a dual-model fuzzing framework to enhance generalized robustness in DL models. Central to our method is a lightweight metric, the Lagrangian Information Bottleneck (LIB), which guides entropy-based mutation toward semantically meaningful and high-risk regions of the input space. The fuzzing executor pairs a resistant model with a more error-prone vulnerable model; their prediction consistency forms the basis of agreement mining, a label-free oracle for isolating decision-boundary samples. To ensure fuzzing effectiveness, we further introduce a task-driven seed selection strategy (e.g., SSIM for vision) that filters out low-quality inputs. We implement a prototype, TWINFUZZ, and evaluate it on six benchmark datasets and nine DL models. Compared with state-of-the-art testing approaches, TWINFUZZ achieves superior improvements in both training-specific and generalized robustness.
Published
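The agreement-mining oracle described above can be illustrated with a minimal sketch: inputs on which the resistant and vulnerable models disagree are treated as likely decision-boundary samples, with no ground-truth labels required. The function and the toy threshold "models" below are hypothetical illustrations, not the authors' implementation.

```python
def agreement_mining(inputs, resistant_predict, vulnerable_predict):
    """Label-free oracle sketch: keep inputs where the two models disagree.

    Disagreement between a robust ("resistant") model and a more
    error-prone ("vulnerable") model suggests the input sits near a
    decision boundary. All names here are illustrative assumptions.
    """
    return [x for x in inputs
            if resistant_predict(x) != vulnerable_predict(x)]

# Toy usage: two 1-D classifiers whose thresholds differ slightly,
# so inputs in the band (0.4, 0.5] trigger disagreement.
resistant = lambda x: int(x > 0.5)
vulnerable = lambda x: int(x > 0.4)
samples = [0.1, 0.45, 0.48, 0.9]
boundary_samples = agreement_mining(samples, resistant, vulnerable)
# boundary_samples -> [0.45, 0.48]
```

In the full framework this oracle would be combined with LIB-guided mutation and SSIM-based seed filtering; the sketch only isolates the consistency check itself.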
2026-03-14
How to Cite
Dai, E., Mo, W., Hu, K., Zhu, X., Xiao, X., Wen, S., … Xiang, Y. (2026). TWINFUZZ: Dual-Model Fuzzing for Robustness Generalization in Deep Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(44), 37314–37322. https://doi.org/10.1609/aaai.v40i44.41063
Issue
Section
AAAI Special Track on AI Alignment