VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models

Authors

  • Ziyi Yin The Pennsylvania State University
  • Muchao Ye The Pennsylvania State University
  • Tianrong Zhang The Pennsylvania State University
  • Jiaqi Wang The Pennsylvania State University
  • Han Liu Dalian University of Technology
  • Jinghui Chen The Pennsylvania State University
  • Ting Wang Stony Brook University
  • Fenglong Ma The Pennsylvania State University

DOI:

https://doi.org/10.1609/aaai.v38i7.28499

Keywords:

CV: Adversarial Attacks & Robustness, CV: Language and Vision

Abstract

Visual Question Answering (VQA) is a fundamental task in the computer vision and natural language processing fields. Although the “pre-training & fine-tuning” learning paradigm significantly improves VQA performance, the adversarial robustness of such a learning paradigm has not been explored. In this paper, we delve into a new problem: using a pre-trained multimodal source model to create adversarial image-text pairs and then transferring them to attack the target VQA models. Correspondingly, we propose a novel VQATTACK model, which can iteratively generate both image and text perturbations with two designed modules: the large language model (LLM)-enhanced image attack module and the cross-modal joint attack module. At each iteration, the LLM-enhanced image attack module first optimizes the latent representation-based loss to generate feature-level image perturbations. Then it incorporates an LLM to further enhance the image perturbations by optimizing the designed masked answer anti-recovery loss. The cross-modal joint attack module is triggered at a specific iteration, updating the image and text perturbations sequentially. Notably, the text perturbation updates are based on both the learned gradients in the word embedding space and word synonym-based substitution. Experimental results on two VQA datasets with five validated models demonstrate the effectiveness of the proposed VQATTACK in the transferable attack setting, compared with state-of-the-art baselines. This work reveals a significant blind spot in the “pre-training & fine-tuning” paradigm on VQA tasks. The source code is available at https://github.com/ericyinyzy/VQAttack.
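The abstract describes an iterative procedure: per-step PGD-style image perturbation driven by a latent-representation loss plus an LLM-based masked answer anti-recovery loss, with a periodic cross-modal joint step that also updates the text via word-embedding gradients and synonym substitution. The sketch below illustrates that loop under stated assumptions; it is not the authors' implementation, and `encode_image`, `masked_answer_loss`, `synonym_substitute`, and the budget/step-size values are hypothetical placeholders.

```python
# Minimal sketch of the iterative attack loop inferred from the abstract.
# All callables and hyperparameters here are assumptions, not the paper's API.
import torch
import torch.nn.functional as F

def vqattack_sketch(image, question_emb, encode_image,
                    masked_answer_loss, synonym_substitute,
                    steps=40, eps=8 / 255, alpha=2 / 255, joint_every=10):
    """Iteratively perturb the image against a latent-representation loss plus
    an LLM-based masked-answer anti-recovery loss; every `joint_every` steps,
    also update the text via embedding gradients and synonym substitution."""
    clean_feat = encode_image(image).detach()   # clean latent to push away from
    adv_image = image.clone().detach()

    for t in range(1, steps + 1):
        adv_image.requires_grad_(True)
        # Feature-level loss: drive the adversarial latent away from the clean one.
        feat_loss = -F.cosine_similarity(
            encode_image(adv_image).flatten(1), clean_feat.flatten(1)).mean()
        # LLM-enhanced term: penalize recovery of the masked answer.
        total_loss = feat_loss + masked_answer_loss(adv_image, question_emb)
        grad_img, = torch.autograd.grad(total_loss, adv_image)

        # PGD-style update within an assumed L_inf budget.
        adv_image = (adv_image + alpha * grad_img.sign()).detach()
        adv_image = (image + (adv_image - image).clamp(-eps, eps)).clamp(0, 1)

        if t % joint_every == 0:
            # Cross-modal joint step: word-embedding gradients guide which
            # tokens to replace with synonyms.
            question_emb = question_emb.detach().requires_grad_(True)
            txt_loss = masked_answer_loss(adv_image, question_emb)
            grad_txt, = torch.autograd.grad(txt_loss, question_emb)
            question_emb = synonym_substitute(question_emb, grad_txt)

    return adv_image, question_emb
```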

Published

2024-03-24

How to Cite

Yin, Z., Ye, M., Zhang, T., Wang, J., Liu, H., Chen, J., Wang, T., & Ma, F. (2024). VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models. Proceedings of the AAAI Conference on Artificial Intelligence, 38(7), 6755-6763. https://doi.org/10.1609/aaai.v38i7.28499

Section

AAAI Technical Track on Computer Vision VI