Reconstruction Attack-Resistant Inference Paradigm for LLM Cloud Services

Authors

  • Zipeng Ye, Harbin Institute of Technology
  • Wenjian Luo, Harbin Institute of Technology
  • Qi Zhou, Harbin Institute of Technology
  • Yubo Tang, Harbin Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v40i21.38853

Abstract

Large language models (LLMs) have seen remarkable growth in recent years. To use convenient LLM cloud services, users inevitably have to upload their prompts. Moreover, tasks such as translation, reading comprehension, and summarization inherently require associated files or context, regardless of whether they contain private user information. Despite the rapid progress in LLM capabilities, research on preserving user privacy during inference has been relatively scarce. To this end, this paper conducts exploratory research in this domain. First, we show that (1) the embedding space of tokens is highly sparse, and (2) LLMs primarily operate in the orthogonal subspace of the embedding space; these two factors make privacy extremely vulnerable. We then analyze the structural characteristics of LLMs and design a distributed privacy-preserving inference paradigm that can effectively resist privacy attacks. Finally, we perform a thorough evaluation of the defended models on mainstream tasks and find that low-bit quantization techniques can be effectively combined with our inference paradigm, striking a balance between privacy, utility, and runtime memory efficiency.
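The vulnerability the abstract points to can be illustrated with a toy sketch (this is not the paper's attack, just a minimal nearest-neighbor inversion under assumed random embeddings): because token embeddings are sparse in a high-dimensional space, even a noisy leaked intermediate representation can often be mapped back to its token by comparing it against the model's public embedding matrix.

```python
# Toy illustration (not the paper's method) of why a sparse token
# embedding space makes reconstruction attacks easy: a noisy leaked
# embedding is inverted by nearest-neighbor search over the public
# embedding matrix shipped with any open-weight LLM.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 1000, 64  # hypothetical sizes for the sketch

# Stand-in "public" embedding matrix with unit-norm rows.
E = rng.standard_normal((vocab_size, dim))
E /= np.linalg.norm(E, axis=1, keepdims=True)

secret_token = 42
# Attacker observes the embedding plus some noise (e.g. from a
# perturbation-based defense that is too weak).
leaked = E[secret_token] + 0.1 * rng.standard_normal(dim)

# Cosine similarity against every vocabulary embedding.
scores = E @ (leaked / np.linalg.norm(leaked))
recovered = int(np.argmax(scores))
print(recovered == secret_token)
```

In high dimensions, random unit vectors are nearly orthogonal, so the true token's similarity score dominates all others even under substantial noise; this is the intuition behind why naive perturbation of embeddings fails to protect user prompts.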

Published

2026-03-14

How to Cite

Ye, Z., Luo, W., Zhou, Q., & Tang, Y. (2026). Reconstruction Attack-Resistant Inference Paradigm for LLM Cloud Services. Proceedings of the AAAI Conference on Artificial Intelligence, 40(21), 17939-17947. https://doi.org/10.1609/aaai.v40i21.38853

Section

AAAI Technical Track on Humans and AI