Stabilizing Cross-Modal Bidirectional Attribution: Few-Shot Adversarial Prompt Tuning for Robust Vision-Language Models
DOI:
https://doi.org/10.1609/aaai.v40i5.37396
Abstract
Large-scale pre-trained vision-language models (VLMs) such as CLIP show exceptional performance and zero-shot generalization, yet their reliability can be severely undermined by subtle adversarial perturbations. Our work reveals a critical cross-modal vulnerability: visual-only perturbations induce substantial, synchronous shifts in decision attribution maps across both the image and text modalities. This phenomenon signifies a fundamental disruption of the VLM's internal logic, altering both the model's perceptual focus and its decision rationale. To counter this vulnerability, we introduce Cross-modal Bidirectional Attribution guided Few-shot Adversarial Prompt Tuning (CBA-FAPT), a novel method that leverages the model's internal decision rationale as a regularizer for robust learning. The core mechanism of our framework is the alignment of a novel bidirectional attribution map that fuses two components: forward feature attention, which captures the model's perceptual focus, and backward decision gradients, which serve as a proxy for the model's decision rationale by quantifying how each feature influences the final outcome. By enforcing consistency of this bidirectional map between clean and adversarial examples, our approach corrects the model's internal logic on both fronts and effectively restores its adversarial robustness. Comprehensive experiments on 11 datasets demonstrate that CBA-FAPT outperforms the state of the art, establishing a superior trade-off between robust and natural accuracy.
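As a rough illustration of the mechanism the abstract describes, the fusion of forward attention with backward decision gradients and the consistency constraint between clean and adversarial attribution maps can be sketched as follows. This is a minimal NumPy sketch under our own assumptions (elementwise fusion, L2 normalization, squared-error consistency), not the authors' implementation; the function names and the exact fusion and loss forms are hypothetical.

```python
import numpy as np

def bidirectional_attribution(attn: np.ndarray, grad: np.ndarray) -> np.ndarray:
    """Fuse forward feature attention (perceptual focus) with backward
    decision gradients (a proxy for decision rationale) into one map.

    Elementwise fusion and L2 normalization are illustrative choices,
    not taken from the paper."""
    fused = attn * grad
    return fused / (np.linalg.norm(fused) + 1e-8)

def consistency_loss(map_clean: np.ndarray, map_adv: np.ndarray) -> float:
    """Penalize shifts of the fused attribution map under attack
    (squared L2 distance, an assumed choice of divergence)."""
    return float(np.sum((map_clean - map_adv) ** 2))

# Toy example: identical attention/gradients on clean and adversarial
# inputs yield zero consistency loss; any attribution shift is penalized.
attn = np.array([[0.2, 0.8], [0.5, 0.5]])
grad = np.array([[1.0, -0.5], [0.3, 0.7]])
m_clean = bidirectional_attribution(attn, grad)
m_adv_same = bidirectional_attribution(attn, grad)
m_adv_shift = bidirectional_attribution(attn[::-1], grad)
```

In a full adversarial prompt-tuning loop, a term like `consistency_loss` would be added to the task loss so that tuned prompts keep the model's attribution stable under perturbation.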
Published
2026-03-14
How to Cite
Feng, J., Wu, S., Sun, H., Zhang, P., Ren, B., & Zhang, S. (2026). Stabilizing Cross-Modal Bidirectional Attribution: Few-Shot Adversarial Prompt Tuning for Robust Vision-Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(5), 3939–3947. https://doi.org/10.1609/aaai.v40i5.37396
Section
AAAI Technical Track on Computer Vision II