TVChain: Leveraging Textual-Visual Prompt Chains for Jailbreaking Large Vision-Language Models
DOI:
https://doi.org/10.1609/aaai.v40i33.40018
Abstract
Large Vision-Language Models (LVLMs) enhance the capabilities of Large Language Models by integrating visual inputs, thereby enabling advanced multimodal reasoning across diverse applications. However, these enhanced reasoning capabilities introduce new security risks, particularly jailbreaking attacks that bypass built-in safety mechanisms to elicit harmful or unauthorized outputs. While recent efforts have explored adversarial and typographic prompts, most existing attacks suffer from three key limitations: reliance on auxiliary models, limited effectiveness in black-box scenarios, and inadequate exploitation of the LVLMs' intrinsic reasoning abilities. In this work, we propose TVChain, a novel black-box jailbreaking framework that explicitly intervenes in both the visual and textual reasoning processes of LVLMs. TVChain decomposes malicious prompts into a sequence of semantically meaningful sub-images that represent relevant objects and behaviors, thereby circumventing direct exposure of illicit content. In parallel, a carefully designed chain-of-thought (CoT) textual prompt is employed to steer the model's reasoning toward reconstructing the intended activity in a covert yet effective manner. We demonstrate that this compositional prompting strategy reduces the likelihood of triggering safety mechanisms while preserving attack efficacy. Extensive evaluations on eleven LVLMs (seven open-source and four commercial) across two benchmark datasets and three state-of-the-art defenses validate the effectiveness and robustness of TVChain.
Published
2026-03-14
How to Cite
Yu, H., Liang, K., Duan, J., Wang, J., Wang, S., Ma, C., & Liu, X. (2026). TVChain: Leveraging Textual-Visual Prompt Chains for Jailbreaking Large Vision-Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(33), 27943–27951. https://doi.org/10.1609/aaai.v40i33.40018
Section
AAAI Technical Track on Machine Learning X