PhysPatch: A Physically Realizable and Transferable Adversarial Patch Attack for Multimodal Large Language Models-based Autonomous Driving Systems

Authors

  • Qi Guo, School of Software Engineering, Xi'an Jiaotong University
  • Xiaojun Jia, School of Computer Science and Engineering, Nanyang Technological University
  • Shanmin Pang, School of Software Engineering, Xi'an Jiaotong University
  • Simeng Qin, Northeastern University
  • Lin Wang, Hangzhou Dianzi University
  • Ju Jia, Southeast University
  • Yang Liu, School of Computer Science and Engineering, Nanyang Technological University
  • Qing Guo, Center for Frontier AI Research, A*STAR

DOI:

https://doi.org/10.1609/aaai.v40i6.42439

Abstract

Multimodal Large Language Models (MLLMs) are becoming integral to autonomous driving (AD) systems due to their strong vision-language reasoning capabilities. However, MLLMs are vulnerable to adversarial attacks—particularly adversarial patch attacks—which can pose serious threats in real-world scenarios. Existing patch-based attack methods are primarily designed for object detection models. Due to the more complex architectures and strong reasoning capabilities of MLLMs, these approaches perform poorly when transferred to MLLM-based systems. To address these limitations, we propose PhysPatch, a physically realizable and transferable adversarial patch framework tailored for MLLM-based AD systems. PhysPatch jointly optimizes patch location, shape, and content to enhance attack effectiveness and real-world applicability. It introduces a semantic-based mask initialization strategy for realistic placement, an SVD-based local alignment loss with patch-guided crop-resize to improve transferability, and a potential field-based mask refinement method. Extensive experiments across open-source, commercial, and reasoning-capable MLLMs demonstrate that PhysPatch significantly outperforms state-of-the-art (SOTA) methods in steering MLLM-based AD systems toward target-aligned perception and planning outputs. Moreover, PhysPatch consistently places adversarial patches in physically feasible regions of AD scenes, ensuring strong real-world applicability and deployability.
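The abstract names an SVD-based local alignment loss as one ingredient for improving transferability. The paper's exact formulation is not given here, so the following is only a plausible minimal sketch of such a loss: it compares the dominant singular subspaces (and their relative energies) of surrogate-encoder feature maps for the adversarial and target images. All function and variable names are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def svd_align_loss(feat_adv, feat_tgt, k=4):
    """Hypothetical sketch of an SVD-based feature alignment loss.

    feat_adv, feat_tgt: (C, H, W) feature maps from a surrogate
    vision encoder for the adversarial and the target image.
    Aligns the top-k left singular subspaces of the flattened
    features, plus their normalized singular-value energies.
    """
    A = feat_adv.reshape(feat_adv.shape[0], -1)  # (C, H*W)
    T = feat_tgt.reshape(feat_tgt.shape[0], -1)
    # Left singular vectors span the dominant channel subspace
    Ua, Sa, _ = np.linalg.svd(A, full_matrices=False)
    Ut, St, _ = np.linalg.svd(T, full_matrices=False)
    # Distance between the rank-k projection matrices (sign-invariant)
    subspace = np.linalg.norm(Ua[:, :k] @ Ua[:, :k].T
                              - Ut[:, :k] @ Ut[:, :k].T)
    # Match the relative spectral energy of the top-k components
    energy = np.linalg.norm(Sa[:k] / Sa[0] - St[:k] / St[0])
    return subspace + energy
```

Comparing projection matrices rather than raw singular vectors keeps the loss invariant to the sign/ordering ambiguity of SVD, which is one common way to make subspace-alignment objectives well behaved.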

Published

2026-03-14

How to Cite

Guo, Q., Jia, X., Pang, S., Qin, S., Wang, L., Jia, J., … Guo, Q. (2026). PhysPatch: A Physically Realizable and Transferable Adversarial Patch Attack for Multimodal Large Language Models-based Autonomous Driving Systems. Proceedings of the AAAI Conference on Artificial Intelligence, 40(6), 4412–4420. https://doi.org/10.1609/aaai.v40i6.42439

Section

AAAI Technical Track on Computer Vision III