MAGIC: Mastering Physical Adversarial Generation in Context Through Collaborative LLM Agents
DOI:
https://doi.org/10.1609/aaai.v40i13.38095
Abstract
Physical adversarial attacks in driving scenarios can expose critical vulnerabilities in visual perception models. However, developing such attacks remains difficult because real-world environmental conditions vary widely. Existing approaches either struggle to generalize to dynamic environments or fail to achieve consistent physical attack performance. To address these challenges, we propose MAGIC (Mastering Physical Adversarial Generation In Context), a framework powered by multi-modal LLM agents that automatically understands the scene context at test time and generates adversarial patches through the interaction of language and vision understanding. Specifically, MAGIC orchestrates three specialized LLM agents: the adv-patch generation agent creates deceptive patches by manipulating the prompts given to text-to-image models; the adv-patch deployment agent ensures contextual coherence by determining deployment strategies based on scene understanding; and the self-examination agent oversees and iteratively refines both processes. We validate our approach in both digital and physical scenarios, i.e., nuImages and real-world scenes, where both statistical and visual results demonstrate that MAGIC is effective at attacking widely used object detection systems, such as the YOLO and DETR series.
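The three-agent arrangement described in the abstract can be pictured as a propose, place, review loop. The following is only a minimal control-flow sketch under stated assumptions, not the authors' implementation: the names `Proposal`, `Review`, `refine`, and the three stub functions are hypothetical, the agents are plain Python callables standing in for LLM calls, and no text-to-image model or detector is involved.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Proposal:
    prompt: str      # text prompt that would be sent to a text-to-image model
    placement: str   # where the patch would be placed in the scene


@dataclass
class Review:
    accepted: bool
    feedback: str    # critique passed back to the other two agents


# Hypothetical agent signatures (assumptions, not taken from the paper).
GenerateFn = Callable[[str, str], str]        # (scene, feedback) -> prompt
DeployFn = Callable[[str, str, str], str]     # (scene, prompt, feedback) -> placement
ExamineFn = Callable[[str, Proposal], Review]


def refine(scene: str, generate: GenerateFn, deploy: DeployFn,
           examine: ExamineFn, max_rounds: int = 3) -> Tuple[Proposal, int]:
    """Run the loop until the examiner accepts or the round budget is spent."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        prompt = generate(scene, feedback)
        placement = deploy(scene, prompt, feedback)
        proposal = Proposal(prompt, placement)
        review = examine(scene, proposal)
        if review.accepted:
            break
        feedback = review.feedback  # feed the critique into the next round
    return proposal, round_no


# Stub agents so the sketch runs without any model behind it.
def stub_generate(scene: str, feedback: str) -> str:
    suffix = f" ({feedback})" if feedback else ""
    return f"poster for {scene}{suffix}"


def stub_deploy(scene: str, prompt: str, feedback: str) -> str:
    return "roadside billboard"


def stub_examine(scene: str, proposal: Proposal) -> Review:
    if "blend" in proposal.prompt:
        return Review(True, "")
    return Review(False, "blend with scene")


if __name__ == "__main__":
    proposal, rounds = refine("urban street", stub_generate,
                              stub_deploy, stub_examine)
    print(rounds, proposal)
```

With these stubs the examiner rejects the first proposal and accepts the second, once its feedback has been folded into the prompt. Keeping the agents as injected callables is a design choice of this sketch: it lets the loop be exercised without committing to any particular model or API.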
Published
2026-03-14
How to Cite
Xing, Y., Chung, N., Zhang, J., Cao, Y., Tsang, I., Liu, Y., … Guo, Q. (2026). MAGIC: Mastering Physical Adversarial Generation in Context Through Collaborative LLM Agents. Proceedings of the AAAI Conference on Artificial Intelligence, 40(13), 11159–11168. https://doi.org/10.1609/aaai.v40i13.38095
Issue
Section
AAAI Technical Track on Computer Vision X