PhysPatch: A Physically Realizable and Transferable Adversarial Patch Attack for Multimodal Large Language Models-based Autonomous Driving Systems

Authors

  • Qi Guo, School of Software Engineering, Xi'an Jiaotong University
  • Xiaojun Jia, School of Computer Science and Engineering, Nanyang Technological University
  • Shanmin Pang, School of Software Engineering, Xi'an Jiaotong University
  • Simeng Qin, Northeastern University
  • Lin Wang, Hangzhou Dianzi University
  • Ju Jia, Southeast University
  • Yang Liu, School of Computer Science and Engineering, Nanyang Technological University
  • Qing Guo, Center for Frontier AI Research, A*STAR

DOI:

https://doi.org/10.1609/aaai.v40i6.42439

Abstract

Multimodal Large Language Models (MLLMs) are becoming integral to autonomous driving (AD) systems due to their strong vision-language reasoning capabilities. However, MLLMs are vulnerable to adversarial attacks—particularly adversarial patch attacks—which can pose serious threats in real-world scenarios. Existing patch-based attack methods are primarily designed for object detection models. Due to the more complex architectures and strong reasoning capabilities of MLLMs, these approaches perform poorly when transferred to MLLM-based systems. To address these limitations, we propose PhysPatch, a physically realizable and transferable adversarial patch framework tailored for MLLM-based AD systems. PhysPatch jointly optimizes patch location, shape, and content to enhance attack effectiveness and real-world applicability. It introduces a semantic-based mask initialization strategy for realistic placement, an SVD-based local alignment loss with patch-guided crop-resize to improve transferability, and a potential field-based mask refinement method. Extensive experiments across open-source, commercial, and reasoning-capable MLLMs demonstrate that PhysPatch significantly outperforms state-of-the-art (SOTA) methods in steering MLLM-based AD systems toward target-aligned perception and planning outputs. Moreover, PhysPatch consistently places adversarial patches in physically feasible regions of AD scenes, ensuring strong real-world applicability and deployability.
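The abstract names an SVD-based local alignment loss as one ingredient for improving transferability. The paper's exact formulation is not given here, so the following is only a plausible minimal sketch of such a loss: it compares the dominant singular subspaces (and their relative energies) of surrogate-encoder feature maps for the adversarial and target images. All function and variable names are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def svd_align_loss(feat_adv, feat_tgt, k=4):
    """Hypothetical sketch of an SVD-based feature alignment loss.

    feat_adv, feat_tgt: (C, H, W) feature maps from a surrogate
    vision encoder for the adversarial and the target image.
    Aligns the top-k left singular subspaces of the flattened
    features, plus their normalized singular-value energies.
    """
    A = feat_adv.reshape(feat_adv.shape[0], -1)  # (C, H*W)
    T = feat_tgt.reshape(feat_tgt.shape[0], -1)
    # Left singular vectors span the dominant channel subspace
    Ua, Sa, _ = np.linalg.svd(A, full_matrices=False)
    Ut, St, _ = np.linalg.svd(T, full_matrices=False)
    # Distance between the rank-k projection matrices (sign-invariant)
    subspace = np.linalg.norm(Ua[:, :k] @ Ua[:, :k].T
                              - Ut[:, :k] @ Ut[:, :k].T)
    # Match the relative spectral energy of the top-k components
    energy = np.linalg.norm(Sa[:k] / Sa[0] - St[:k] / St[0])
    return subspace + energy
```

Comparing projection matrices rather than raw singular vectors keeps the loss invariant to the sign/ordering ambiguity of SVD, which is one common way to make subspace-alignment objectives well behaved.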

Published

2026-03-14

How to Cite

Guo, Q., Jia, X., Pang, S., Qin, S., Wang, L., Jia, J., … Guo, Q. (2026). PhysPatch: A Physically Realizable and Transferable Adversarial Patch Attack for Multimodal Large Language Models-based Autonomous Driving Systems. Proceedings of the AAAI Conference on Artificial Intelligence, 40(6), 4412–4420. https://doi.org/10.1609/aaai.v40i6.42439

Section

AAAI Technical Track on Computer Vision III