MAGIC: Mastering Physical Adversarial Generation in Context Through Collaborative LLM Agents

Authors

  • Yun Xing (University of Alberta, Canada; Nankai University, China; A*STAR, Singapore)
  • Nhat Chung (National University of Singapore, Singapore; A*STAR, Singapore)
  • Jie Zhang (A*STAR, Singapore)
  • Yue Cao (Nanyang Technological University, Singapore; A*STAR, Singapore)
  • Ivor Tsang (Nanyang Technological University, Singapore; A*STAR, Singapore)
  • Yang Liu (Nanyang Technological University, Singapore)
  • Lei Ma (University of Alberta, Canada; The University of Tokyo, Japan)
  • Qing Guo (Nankai University, China)

DOI:

https://doi.org/10.1609/aaai.v40i13.38095

Abstract

Physical adversarial attacks in driving scenarios can expose critical vulnerabilities in visual perception models. However, developing such attacks remains non-trivial because real-world environmental conditions are diverse. Existing approaches either struggle to generalize to dynamic environments or fail to achieve consistent physical attack performance. To address these challenges, we propose MAGIC (Mastering Physical Adversarial Generation In Context), a novel framework powered by multi-modal LLM agents that automatically understands the scene context at test time and generates adversarial patches through the synergistic interaction of language and vision understanding. Specifically, MAGIC orchestrates three specialized LLM agents: the adv-patch generation agent creates deceptive patches via strategic prompt manipulation for text-to-image models; the adv-patch deployment agent ensures contextual coherence by determining optimal deployment strategies based on scene understanding; and the self-examination agent completes the trio by providing critical oversight and iterative refinement of both processes. We validate our approach in both digital and physical scenarios, i.e., nuImage and real-world scenes, where both statistical and visual results demonstrate that MAGIC is effective at attacking widely used object detection systems, such as the YOLO and DETR series.
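
The three-agent interaction described in the abstract can be pictured as a propose-and-review loop. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function names (`run_agents`, `stub_generate`, `stub_deploy`, `stub_examine`), the `Proposal` fields, the placement format, and the round limit are all hypothetical, and the stub agents stand in for real multi-modal LLM and text-to-image calls.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical placement format: (x, y, width, height) in scene pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Proposal:
    prompt: str      # text-to-image prompt chosen by the generation agent
    placement: Box   # location chosen by the deployment agent
    rounds: int      # number of refinement rounds used so far


def run_agents(scene, generate, deploy, examine, max_rounds: int = 3) -> Proposal:
    """Iterate generation -> deployment -> self-examination until accepted."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback: Optional[str] = None
    proposal: Optional[Proposal] = None
    for round_idx in range(1, max_rounds + 1):
        prompt = generate(scene, feedback)           # generation agent
        placement = deploy(scene, prompt, feedback)  # deployment agent
        proposal = Proposal(prompt, placement, round_idx)
        accepted, feedback = examine(scene, proposal)  # self-examination agent
        if accepted:
            break
    return proposal


# Stub agents (assumptions): deterministic stand-ins for model calls.
def stub_generate(scene, feedback):
    base = "roadside poster"
    return base if feedback is None else f"{base}, {feedback}"


def stub_deploy(scene, prompt, feedback):
    return (10, 20, 64, 64)


def stub_examine(scene, proposal):
    # Accept only once the requested revision appears in the prompt.
    if "matte" in proposal.prompt:
        return True, None
    return False, "matte finish"


result = run_agents("street scene", stub_generate, stub_deploy, stub_examine)
```

Passing the agents in as plain callables keeps the loop independent of any particular model backend; in this toy run the examiner rejects the first proposal and its feedback is folded into the second prompt.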

Published

2026-03-14

How to Cite

Xing, Y., Chung, N., Zhang, J., Cao, Y., Tsang, I., Liu, Y., … Guo, Q. (2026). MAGIC: Mastering Physical Adversarial Generation in Context Through Collaborative LLM Agents. Proceedings of the AAAI Conference on Artificial Intelligence, 40(13), 11159–11168. https://doi.org/10.1609/aaai.v40i13.38095

Section

AAAI Technical Track on Computer Vision X