False Positives Matter: Multidimensional Localization Evaluation and Training-Free Explainable Adversarial Patch Defense

Lihua Jing; Rui Wang; Jinwen Zhong; Runbo Li; Zixuan Zhu

doi:10.1609/aaai.v40i7.37474

Authors

Lihua Jing: Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences
Rui Wang: Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences
Jinwen Zhong: Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences
Runbo Li: Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences
Zixuan Zhu: Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences

Abstract

Adversarial patch attacks pose a significant threat to visual systems. While current patch purification-based defense methods enhance core metrics of visual perception models, they overlook the critical issue of false positive patches, severely compromising image usability. This paper reveals the inadequacy of existing evaluations for adversarial patch defenses, and pioneers a multidimensional adversarial patch localization evaluation framework, which comprehensively quantifies false positives, recall capability, and overall localization accuracy, providing a novel perspective for comparative analysis within the field. Furthermore, building upon the observation that false positives stem from a lack of semantic understanding, we propose a Semantic-Aware Training-free Explainable Defense method (SATED). SATED achieves zero-shot patch localization, false detection correction, and decision explanation by constructing a patch reasoning chain, while simultaneously performing integrated text-guided patch inpainting. Extensive experiments across digital and physical scenarios, detection and segmentation tasks, and diverse adversarial patches, demonstrate that our method significantly reduces false positives and doubles the overall patch localization accuracy, boosting both the generalizability and explainability of the defense.
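
To make the three evaluation dimensions concrete, the following is a minimal sketch of pixel-level mask metrics. It is not the paper's framework: the abstract does not define its metrics, so this sketch assumes false positive rate, recall, and intersection-over-union (IoU) as stand-ins for "false positives, recall capability, and overall localization accuracy". The function name `localization_metrics` and the toy masks are hypothetical.

```python
from typing import Dict, List

Mask = List[List[int]]  # 1 = pixel flagged as patch, 0 = clean pixel


def localization_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Compare a predicted patch mask with the ground-truth mask, pixel by pixel.

    Assumed metrics (not taken from the paper):
      false_positive_rate = FP / (FP + TN)
      recall              = TP / (TP + FN)
      iou                 = TP / (TP + FP + FN)
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # patch pixel correctly flagged
            elif p:
                fp += 1      # clean pixel wrongly flagged
            elif t:
                fn += 1      # patch pixel missed
            else:
                tn += 1      # clean pixel correctly left alone

    def ratio(num: int, den: int) -> float:
        # Guard against empty denominators (e.g. a mask with no patch pixels).
        return num / den if den else 0.0

    return {
        "false_positive_rate": ratio(fp, fp + tn),
        "recall": ratio(tp, tp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }


# Toy 4x4 example: the true patch is the central 2x2 block; the prediction
# covers it fully but also flags three clean pixels in the top-left corner.
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
metrics = localization_metrics(pred, truth)
print(metrics)
```

In this toy case recall is perfect, yet a quarter of the clean pixels are flagged and IoU drops to 4/7. A recall-only evaluation would miss exactly this kind of over-masking.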
