Membership Inference Attack Against Large Language Model-Based Recommendation Systems: A New Distillation-Based Paradigm

Cuihong Li; Xiaowen Huang; Chuanhuan Yin; Jitao Sang

doi:10.1609/aaai.v40i27.39449

Authors

Cuihong Li: School of Computer Science and Technology, Beijing Jiaotong University.
Xiaowen Huang: School of Computer Science and Technology, Beijing Jiaotong University; Beijing Key Laboratory of Traffic Data Mining and Embodied Intelligence; Key Laboratory of Big Data & Artificial Intelligence in Transportation, Ministry of Education.
Chuanhuan Yin: School of Computer Science and Technology, Beijing Jiaotong University.
Jitao Sang: School of Computer Science and Technology, Beijing Jiaotong University; Beijing Key Laboratory of Traffic Data Mining and Embodied Intelligence; Key Laboratory of Big Data & Artificial Intelligence in Transportation, Ministry of Education.

DOI: https://doi.org/10.1609/aaai.v40i27.39449

Abstract

Membership Inference Attack (MIA) aims to determine whether a specific data sample was included in the training dataset of a target model. Traditional MIA approaches rely on shadow models to mimic target model behavior, but their effectiveness diminishes for Large Language Model (LLM)-based recommendation systems due to the scale and complexity of training data. This paper introduces a novel knowledge distillation-based MIA paradigm tailored for LLM-based recommendation systems. Our method constructs a reference model via distillation, applying distinct strategies for member and non-member data to enhance discriminative capabilities. The paradigm extracts fused features (e.g., confidence, entropy, loss, and hidden layer vectors) from the reference model to train an attack model, overcoming limitations of individual features. Extensive experiments on extended datasets (Last.FM, MovieLens, Book-Crossing, Delicious) and diverse LLMs (T5, GPT-2, LLaMA3) demonstrate that our approach significantly outperforms shadow model-based MIAs and individual-feature baselines. The results show its practicality for privacy attacks in LLM-driven recommender systems.
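The fused-feature idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the features are computed or combined, so the function names (`softmax`, `fused_features`), the use of plain concatenation as the fusion step, and the toy inputs are all assumptions made for illustration. The sketch takes the logits a reference model might produce for one sample, derives confidence, entropy, and loss, and appends a hidden-layer vector to form one input row for an attack classifier.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fused_features(
    logits: Sequence[float],
    label: int,
    hidden: Sequence[float],
) -> List[float]:
    """Build one fused feature vector for a single sample.

    Hypothetical helper: concatenation is an assumed fusion strategy,
    not one confirmed by the paper's abstract.
    """
    probs = softmax(logits)
    # Confidence: probability of the most likely output.
    confidence = max(probs)
    # Entropy of the predictive distribution (natural log).
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    # Cross-entropy loss against the true label, clamped to avoid log(0).
    loss = -math.log(max(probs[label], 1e-12))
    # Scalar features first, then the hidden-layer vector.
    return [confidence, entropy, loss, *hidden]


# Toy inputs (assumed values, not data from the paper).
features = fused_features([2.0, 0.5, -1.0], label=0, hidden=[0.1, -0.3])
```

Rows built this way for samples whose membership status is known could then be fed to any binary classifier acting as the attack model; the choice of classifier is left open here because the abstract does not specify one.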
