MAGIC: Mastering Physical Adversarial Generation in Context Through Collaborative LLM Agents

Yun Xing; Nhat Chung; Jie Zhang; Yue Cao; Ivor Tsang; Yang Liu; Lei Ma; Qing Guo

doi:10.1609/aaai.v40i13.38095

Authors

Yun Xing: University of Alberta, Canada; Nankai University, China; A*STAR, Singapore
Nhat Chung: National University of Singapore, Singapore; A*STAR, Singapore
Jie Zhang: A*STAR, Singapore
Yue Cao: Nanyang Technological University, Singapore; A*STAR, Singapore
Ivor Tsang: Nanyang Technological University, Singapore; A*STAR, Singapore
Yang Liu: Nanyang Technological University, Singapore
Lei Ma: University of Alberta, Canada; The University of Tokyo, Japan
Qing Guo: Nankai University, China

DOI:

https://doi.org/10.1609/aaai.v40i13.38095

Abstract

Physical adversarial attacks in driving scenarios can expose critical vulnerabilities in visual perception models. Developing such attacks remains difficult, however, because real-world environmental conditions vary widely. Existing approaches either struggle to generalize to dynamic environments or fail to achieve consistent physical attack performance. To address these challenges, we propose MAGIC (Mastering Physical Adversarial Generation In Context), a framework powered by multi-modal LLM agents that automatically interprets the scene context at test time and generates adversarial patches through the interaction of language and vision understanding. Specifically, MAGIC orchestrates three specialized LLM agents: the adv-patch generation agent creates deceptive patches by strategically manipulating the prompts given to text-to-image models; the adv-patch deployment agent ensures contextual coherence by determining optimal deployment strategies based on scene understanding; and the self-examination agent oversees both processes and refines them iteratively. We validate our approach in both digital and physical settings, i.e., nuImage and real-world scenes, where both quantitative and visual results show that MAGIC is effective at attacking widely used object detection systems, such as the YOLO and DETR series.
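
The three-agent interaction described in the abstract can be pictured as a generate, deploy, and examine loop. The following is a minimal sketch of that control flow only, not the authors' implementation: the function `run_magic_style_loop`, the `Placement` and `Review` types, the agent signatures, and the `max_rounds` limit are all hypothetical names chosen for illustration, and the agents are plain stub functions rather than real LLM or text-to-image calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Placement:
    """Hypothetical deployment decision: where and how large the patch is."""
    x: int
    y: int
    scale: float


@dataclass
class Review:
    """Hypothetical verdict from the self-examination step."""
    accepted: bool
    feedback: str = ""


# Assumed agent interfaces (illustrative only, not taken from the paper).
GenerateFn = Callable[[str, str], str]               # (scene, feedback) -> patch prompt
DeployFn = Callable[[str, str], Placement]           # (scene, prompt) -> placement
ExamineFn = Callable[[str, str, Placement], Review]  # (scene, prompt, placement) -> review


def run_magic_style_loop(
    scene: str,
    generate: GenerateFn,
    deploy: DeployFn,
    examine: ExamineFn,
    max_rounds: int = 3,
) -> Tuple[str, Optional[Placement], int]:
    """Iterate generation and deployment until the examiner accepts.

    Returns the last prompt, the last placement, and the number of rounds used.
    """
    feedback = ""
    prompt = ""
    placement: Optional[Placement] = None
    rounds_used = 0
    for round_idx in range(1, max_rounds + 1):
        rounds_used = round_idx
        prompt = generate(scene, feedback)          # generation agent
        placement = deploy(scene, prompt)           # deployment agent
        review = examine(scene, prompt, placement)  # self-examination agent
        if review.accepted:
            break
        feedback = review.feedback                  # feed critique into next round
    return prompt, placement, rounds_used


# Stub agents standing in for LLM calls, so the sketch runs offline.
def stub_generate(scene: str, feedback: str) -> str:
    base = f"a roadside poster that blends into: {scene}"
    return base if not feedback else f"{base} ({feedback})"


def stub_deploy(scene: str, prompt: str) -> Placement:
    return Placement(x=120, y=80, scale=0.5)


def stub_examine(scene: str, prompt: str, placement: Placement) -> Review:
    if "muted colors" in prompt:
        return Review(accepted=True)
    return Review(accepted=False, feedback="use muted colors")


if __name__ == "__main__":
    result = run_magic_style_loop(
        "urban street at dusk", stub_generate, stub_deploy, stub_examine
    )
    print(result)
```

With these stubs, the examiner rejects the first prompt and its feedback is folded into the second, which illustrates how a critique can drive iterative refinement of both the patch prompt and its placement.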
