Eguard: Defending LLM Embeddings Against Inversion Attacks via Text Mutual Information Optimization

Authors

  • Tiantian Liu — State Key Laboratory of Blockchain and Data Security, Zhejiang University; Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security; School of Informatics, Xiamen University
  • Hongwei Yao — State Key Laboratory of Blockchain and Data Security, Zhejiang University; Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security
  • Feng Lin — State Key Laboratory of Blockchain and Data Security, Zhejiang University; Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security
  • Tong Wu — State Key Laboratory of Blockchain and Data Security, Zhejiang University; Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security
  • Zhan Qin — State Key Laboratory of Blockchain and Data Security, Zhejiang University; Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security
  • Kui Ren — State Key Laboratory of Blockchain and Data Security, Zhejiang University; Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security

DOI:

https://doi.org/10.1609/aaai.v40i42.40876

Abstract

While text embeddings enable efficient semantic processing in LLMs, they remain vulnerable to inversion attacks that reconstruct the sensitive original text. Current defense methods, however, typically treat text embeddings independently at the feature level, overlooking the mutual relations that arise along the embedding construction pipeline. To address this limitation, we propose Eguard, a framework that disrupts the chain of relationships between the original semantic space and the defended functional space. Our improvements operate at two levels of mutual information: global and local. At the global level, we minimize the statistical dependency between protected embeddings and their original inputs, effectively decoupling sensitive content from the semantic space accessible to adversaries. At the local level, we apply keyword-antonym contrastive learning to enforce semantic discriminability within the space of downstream utility. This synergy of global privacy control and local semantic alignment allows Eguard to achieve a superior privacy-utility trade-off compared with traditional defenses. Our approach significantly reduces privacy risk, protecting over 95 percent of tokens from inversion while maintaining high performance across downstream tasks, consistent with the original embeddings.
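The abstract describes a two-part objective: a global term that reduces the statistical dependency between protected embeddings and their source text, and a local term that uses keyword-antonym contrastive learning to preserve task-relevant distinctions. The paper's exact losses are not given on this page, so the sketch below is only an illustrative stand-in: cosine-similarity suppression serves as a crude proxy for mutual-information minimization, and a hinge term plays the role of the keyword-antonym contrast. All function names and parameters (`eguard_style_loss`, `lam`, `margin`) are hypothetical.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def eguard_style_loss(protected, original, keyword, antonym, lam=1.0, margin=0.5):
    """Illustrative combined objective (hypothetical form, not the paper's losses).

    Global term: penalize squared similarity between the protected embedding and
    the original embedding -- a rough stand-in for minimizing their mutual
    information, so inversion from the protected embedding becomes harder.

    Local term: hinge loss that keeps the protected embedding closer to a
    keyword embedding than to its antonym embedding by at least `margin`,
    mimicking keyword-antonym contrastive learning for downstream utility.
    """
    global_term = cosine(protected, original) ** 2       # drive toward 0 (decoupling)
    pos = cosine(protected, keyword)                     # similarity to keyword
    neg = cosine(protected, antonym)                     # similarity to antonym
    local_term = max(0.0, margin - (pos - neg))          # want pos to exceed neg
    return global_term + lam * local_term

# A "well-defended" configuration: protected is orthogonal to the original,
# close to the keyword, and far from the antonym -> near-zero loss.
good = eguard_style_loss(
    protected=np.array([1.0, 0.0]),
    original=np.array([0.0, 1.0]),
    keyword=np.array([1.0, 0.1]),
    antonym=np.array([-1.0, 0.0]),
)

# A "leaky" configuration: protected equals the original and prefers the
# antonym over the keyword -> large loss.
bad = eguard_style_loss(
    protected=np.array([1.0, 0.0]),
    original=np.array([1.0, 0.0]),
    keyword=np.array([-1.0, 0.0]),
    antonym=np.array([1.0, 0.0]),
)
```

In a real training loop both terms would be differentiable (e.g., a variational mutual-information bound such as CLUB for the global term and an InfoNCE-style loss for the local term) and optimized jointly over batches; the scalar version here only illustrates how the two levels trade off through the weight `lam`.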

Published

2026-03-14

How to Cite

Liu, T., Yao, H., Lin, F., Wu, T., Qin, Z., & Ren, K. (2026). Eguard: Defending LLM Embeddings Against Inversion Attacks via Text Mutual Information Optimization. Proceedings of the AAAI Conference on Artificial Intelligence, 40(42), 35644–35652. https://doi.org/10.1609/aaai.v40i42.40876

Section

AAAI Technical Track on Philosophy and Ethics of AI