Reconstruction Attack-Resistant Inference Paradigm for LLM Cloud Services
DOI: https://doi.org/10.1609/aaai.v40i21.38853
Abstract
Large language models (LLMs) have seen remarkable growth in recent years. To use convenient LLM cloud services, users inevitably have to upload their prompts. Moreover, tasks such as translation, reading comprehension, and summarization inherently require associated files or context, whether or not these contain private user information. Despite the rapid progress in LLM capabilities, research on preserving user privacy during inference has been relatively scarce. To this end, this paper conducts exploratory research in this domain. First, we show that (1) the embedding space of tokens is highly sparse, and (2) LLMs operate primarily in the orthogonal subspace of the embedding space; together, these two factors make privacy extremely vulnerable. Then, we analyze the structural characteristics of LLMs and design a distributed privacy-preserving inference paradigm that effectively resists privacy attacks. Finally, we thoroughly evaluate the defended models on mainstream tasks and find that low-bit quantization techniques can be combined effectively with our inference paradigm, achieving a balance between privacy, utility, and runtime memory efficiency.
Published
2026-03-14
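The vulnerability the abstract points to, that sparse token embeddings make reconstruction attacks easy, can be illustrated with a minimal sketch. All data below (the toy vocabulary, dimensions, and noise level) is hypothetical and not from the paper: because the embedding table is public and its rows are far apart in a high-dimensional space, a nearest-neighbor lookup over intercepted (even noisy) embedding vectors recovers the original tokens.

```python
import numpy as np

# Hypothetical toy setup: a tiny vocabulary embedded in 32 dimensions.
# Six points in R^32 are extremely sparse, so rows of E are far apart.
rng = np.random.default_rng(0)
vocab = ["the", "patient", "has", "diabetes", "cloud", "model"]
d = 32
E = rng.normal(size=(len(vocab), d))  # public embedding table

# Suppose an attacker intercepts slightly perturbed prompt embeddings.
prompt_ids = [1, 2, 3]  # "patient has diabetes"
leaked = E[prompt_ids] + 0.05 * rng.normal(size=(len(prompt_ids), d))

# Reconstruction attack: map each leaked vector to its nearest
# embedding row; sparsity makes the nearest neighbor unambiguous.
dists = np.linalg.norm(leaked[:, None, :] - E[None, :, :], axis=-1)
recovered = [vocab[i] for i in dists.argmin(axis=1)]
print(recovered)  # ['patient', 'has', 'diabetes']
```

Under this sketch's assumptions, small perturbations do not prevent recovery, which is why the paper argues that defenses must go beyond adding noise to embeddings.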
How to Cite
Ye, Z., Luo, W., Zhou, Q., & Tang, Y. (2026). Reconstruction Attack-Resistant Inference Paradigm for LLM Cloud Services. Proceedings of the AAAI Conference on Artificial Intelligence, 40(21), 17939-17947. https://doi.org/10.1609/aaai.v40i21.38853
Section
AAAI Technical Track on Humans and AI