Your Prompts Are Not Safe: Output-Free Membership Inference via Prompt Vectors in Vision-Language Tuning

Authors

  • Yuran Bian, School of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Key Laboratory of Integrated Administration Technologies for Information Security, Shanghai 200240, China
  • Xiaohan Zhang, School of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Key Laboratory of Integrated Administration Technologies for Information Security, Shanghai 200240, China
  • Zhiyuan Yu, Washington University in St. Louis
  • Changqing Li, School of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Key Laboratory of Integrated Administration Technologies for Information Security, Shanghai 200240, China; Zhangjiang Institute for Advanced Study, Shanghai 201203, China
  • Li Pan, School of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Key Laboratory of Integrated Administration Technologies for Information Security, Shanghai 200240, China; Zhangjiang Institute for Advanced Study, Shanghai 201203, China

DOI:

https://doi.org/10.1609/aaai.v40i42.40839

Abstract

Prompt tuning enables Vision-Language Models (VLMs) to efficiently adapt to new tasks through learnable prompt vectors. This naturally raises a question: do these prompts leak private information about their training data? While Membership Inference Attacks (MIAs) can quantify this risk, current methods rely on access to model outputs or internal gradients. This limitation prevents a clear assessment of a prompt’s standalone privacy leakage, particularly in deployment scenarios where such information is inaccessible. In this paper, we propose Prompt Intrinsic Privacy Risk Analyzer (PIPRA) to address this gap. As the first output-free MIA, PIPRA leverages open-source pre-trained VLMs to extract features from both prompts and samples within a shared cross-modal semantic space. By employing a contrastive learning-based feature projector to enhance these representations, PIPRA enables a subsequent discriminator to effectively perform membership inference. Extensive experiments across nine benchmark datasets and multiple VLMs show PIPRA achieves an average AUC of 87.58%, significantly outperforming traditional output-dependent methods (77.05%). These findings reveal that prompts pose a substantially greater privacy risk than previously recognized, highlighting the urgent need for prompt-level privacy protection.
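To make the output-free setting concrete, below is a minimal numpy sketch of the core idea the abstract describes: both the learned prompt vector and candidate samples are embedded into a shared space by a frozen pre-trained VLM, and a discriminator decides membership from those features alone, with no access to the tuned model's outputs or gradients. This is not the authors' implementation: the random features, the identity projector, and the similarity-threshold discriminator are hypothetical stand-ins for the paper's VLM feature extractor, contrastive learning-based projector, and learned discriminator.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 512  # hypothetical CLIP-style embedding dimension

# Stand-ins for features a frozen open-source VLM would produce:
# one learned prompt vector and a batch of candidate sample embeddings.
prompt_feat = rng.normal(size=D)
sample_feats = rng.normal(size=(8, D))

def l2_normalize(x, axis=-1):
    """Project features onto the unit sphere, as in CLIP-style spaces."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# The paper trains a contrastive feature projector here; this sketch
# uses normalization only, so prompt and samples share one metric space.
p = l2_normalize(prompt_feat)
s = l2_normalize(sample_feats)

# One cosine-similarity score per candidate sample, computed purely from
# the prompt vector and sample features (output-free by construction).
cos_sim = s @ p  # shape: (8,)

# A trivial threshold standing in for the learned membership discriminator.
threshold = 0.0
membership_guess = cos_sim > threshold
```

In the actual attack the threshold discriminator would be replaced by a classifier trained on projected features of known member and non-member samples; the sketch only illustrates that every quantity involved is available to an attacker who holds the prompt vector itself.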

Published

2026-03-14

How to Cite

Bian, Y., Zhang, X., Yu, Z., Li, C., & Pan, L. (2026). Your Prompts Are Not Safe: Output-Free Membership Inference via Prompt Vectors in Vision-Language Tuning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(42), 35313–35321. https://doi.org/10.1609/aaai.v40i42.40839

Section

AAAI Technical Track on Philosophy and Ethics of AI