Semantic Intent Fragmentation: A Single-Shot Compositional Attack on Multi-Agent AI Pipelines

Authors

  • Tanzim Ahad, University of Texas at El Paso
  • Ismail Hossain, University of Texas at El Paso
  • Md Jahangir Alam, University of Texas at El Paso
  • Sai Puppala, Southern Illinois University Carbondale
  • Yoonpyo Lee, Hanyang University
  • Syed Bahauddin Alam, University of Illinois Urbana-Champaign
  • Sajedul Talukder, University of Texas at El Paso

DOI:

https://doi.org/10.1609/aaaiss.v9i1.42936

Abstract

We introduce Semantic Intent Fragmentation (SIF), a new attack class against large language model (LLM) orchestration systems. In SIF, a single legitimately phrased enterprise request causes an LLM orchestrator to autonomously decompose a task into subtasks that are individually benign but jointly violate security policy. Because all deployed safety mechanisms evaluate individual subtasks in isolation, each step passes existing classifiers while the harmful outcome emerges only when the plan is considered as a whole, a structural blind spot we term the plan-generation gap. Unlike prior multi-agent attacks, SIF requires no injected content, no system modification, and no attacker interaction after the initial request, a property we term single-shot autonomy. We formalise this vulnerability with the Fragmentation Score (FS) and prove, without distributional assumptions, that no per-subtask classifier upgrade can close it. In a 14-scenario empirical study spanning financial, security, and HR domains, 71% of enterprise requests produce policy-violating plans even though every individual subtask passes six independent classifier families. A Compositional Intent Verifier (CIV), a plan-level LLM judge that checks for cross-subtask policy violations, detects all confirmed attacks at a 0% false-positive rate when combined with information-flow control (IFC) taint analysis, demonstrating that pre-dispatch plan-level evaluation is both necessary and sufficient to close the gap.
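The gap between per-subtask and plan-level evaluation can be illustrated with a toy example. The sketch below is not the authors' CIV or IFC implementation and does not compute the Fragmentation Score; it is a minimal illustration under stated assumptions. The `Subtask` type, the resource names (`hr_salary_db`, `external_email`), and both checker functions are hypothetical. It only shows how each step of a plan can pass an isolated check while a simple taint propagation over the whole plan reveals a flow from a sensitive source to an external sink.

```python
from dataclasses import dataclass

# Hypothetical policy: data from these sources must never reach these sinks.
SENSITIVE_SOURCES = frozenset({"hr_salary_db"})
EXTERNAL_SINKS = frozenset({"external_email"})


@dataclass(frozen=True)
class Subtask:
    """One step of an orchestrator plan (illustrative structure only)."""
    name: str
    reads: frozenset
    writes: frozenset


def per_subtask_ok(task):
    """Isolated check: flags a step only if it alone links source to sink."""
    touches_source = bool(task.reads & SENSITIVE_SOURCES)
    touches_sink = bool(task.writes & EXTERNAL_SINKS)
    return not (touches_source and touches_sink)


def plan_level_violations(plan):
    """Propagate taint through the ordered plan before any step is dispatched."""
    tainted = set(SENSITIVE_SOURCES)
    violations = []
    for task in plan:
        if task.reads & tainted:
            # Everything this step writes inherits the taint of what it read.
            tainted |= task.writes
            if task.writes & EXTERNAL_SINKS:
                violations.append(task.name)
    return violations


# Assumed plan: three individually innocuous steps that jointly leak data.
plan = [
    Subtask("export_salaries", frozenset({"hr_salary_db"}), frozenset({"report.csv"})),
    Subtask("summarize_report", frozenset({"report.csv"}), frozenset({"summary.txt"})),
    Subtask("email_summary", frozenset({"summary.txt"}), frozenset({"external_email"})),
]

# Assumed benign plan: external email of non-sensitive data.
benign_plan = [
    Subtask("fetch_docs", frozenset({"public_docs"}), frozenset({"digest.txt"})),
    Subtask("email_digest", frozenset({"digest.txt"}), frozenset({"external_email"})),
]

all_pass = all(per_subtask_ok(t) for t in plan)
violations = plan_level_violations(plan)
print(all_pass, violations)  # True ['email_summary']
```

In this toy setting every step passes the isolated check, yet the plan-level pass flags `email_summary`, while the benign plan produces no violation. A real verifier would need far richer policy and data-flow modelling than these set intersections.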

Published

2026-06-23

How to Cite

Ahad, T., Hossain, I., Alam, M. J., Puppala, S., Lee, Y., Alam, S. B., & Talukder, S. (2026). Semantic Intent Fragmentation: A Single-Shot Compositional Attack on Multi-Agent AI Pipelines. Proceedings of the AAAI Symposium Series, 9(1), 229–237. https://doi.org/10.1609/aaaiss.v9i1.42936

Section

Human-Aware AI Agents for the Cyber Battlefield: From Human Models to Autonomous Defense (Full Papers)