Your Prompts Are Not Safe: Output-Free Membership Inference via Prompt Vectors in Vision-Language Tuning

Authors

  • Yuran Bian, School of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Key Laboratory of Integrated Administration Technologies for Information Security, Shanghai 200240, China
  • Xiaohan Zhang, School of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Key Laboratory of Integrated Administration Technologies for Information Security, Shanghai 200240, China
  • Zhiyuan Yu, Washington University in St. Louis
  • Changqing Li, School of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Key Laboratory of Integrated Administration Technologies for Information Security, Shanghai 200240, China; Zhangjiang Institute for Advanced Study, Shanghai 201203, China
  • Li Pan, School of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Key Laboratory of Integrated Administration Technologies for Information Security, Shanghai 200240, China; Zhangjiang Institute for Advanced Study, Shanghai 201203, China

DOI:

https://doi.org/10.1609/aaai.v40i42.40839

Abstract

Prompt tuning enables Vision-Language Models (VLMs) to efficiently adapt to new tasks through learnable prompt vectors. This naturally raises a question: do these prompts leak private information about their training data? While Membership Inference Attacks (MIAs) can quantify this risk, current methods rely on access to model outputs or internal gradients. This limitation prevents a clear assessment of a prompt’s standalone privacy leakage, particularly in deployment scenarios where such information is inaccessible. In this paper, we propose Prompt Intrinsic Privacy Risk Analyzer (PIPRA) to address this gap. As the first output-free MIA, PIPRA leverages open-source pre-trained VLMs to extract features from both prompts and samples within a shared cross-modal semantic space. By employing a contrastive learning-based feature projector to enhance these representations, PIPRA enables a subsequent discriminator to effectively perform membership inference. Extensive experiments across nine benchmark datasets and multiple VLMs show PIPRA achieves an average AUC of 87.58%, significantly outperforming traditional output-dependent methods (77.05%). These findings reveal that prompts pose a substantially greater privacy risk than previously recognized, highlighting the urgent need for prompt-level privacy protection.
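To make the output-free setting concrete, below is a minimal numpy sketch of the core idea the abstract describes: both the learned prompt vector and candidate samples are embedded into a shared space by a frozen pre-trained VLM, and a discriminator decides membership from those features alone, with no access to the tuned model's outputs or gradients. This is not the authors' implementation: the random features, the identity projector, and the similarity-threshold discriminator are hypothetical stand-ins for the paper's VLM feature extractor, contrastive learning-based projector, and learned discriminator.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 512  # hypothetical CLIP-style embedding dimension

# Stand-ins for features a frozen open-source VLM would produce:
# one learned prompt vector and a batch of candidate sample embeddings.
prompt_feat = rng.normal(size=D)
sample_feats = rng.normal(size=(8, D))

def l2_normalize(x, axis=-1):
    """Project features onto the unit sphere, as in CLIP-style spaces."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# The paper trains a contrastive feature projector here; this sketch
# uses normalization only, so prompt and samples share one metric space.
p = l2_normalize(prompt_feat)
s = l2_normalize(sample_feats)

# One cosine-similarity score per candidate sample, computed purely from
# the prompt vector and sample features (output-free by construction).
cos_sim = s @ p  # shape: (8,)

# A trivial threshold standing in for the learned membership discriminator.
threshold = 0.0
membership_guess = cos_sim > threshold
```

In the actual attack the threshold discriminator would be replaced by a classifier trained on projected features of known member and non-member samples; the sketch only illustrates that every quantity involved is available to an attacker who holds the prompt vector itself.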

Published

2026-03-14

How to Cite

Bian, Y., Zhang, X., Yu, Z., Li, C., & Pan, L. (2026). Your Prompts Are Not Safe: Output-Free Membership Inference via Prompt Vectors in Vision-Language Tuning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(42), 35313–35321. https://doi.org/10.1609/aaai.v40i42.40839

Section

AAAI Technical Track on Philosophy and Ethics of AI