Membership Inference Attack Against Large Language Model-Based Recommendation Systems: A New Distillation-Based Paradigm

Authors

  • Cuihong Li (School of Computer Science and Technology, Beijing Jiaotong University)
  • Xiaowen Huang (School of Computer Science and Technology, Beijing Jiaotong University; Beijing Key Laboratory of Traffic Data Mining and Embodied Intelligence; Key Laboratory of Big Data & Artificial Intelligence in Transportation, Ministry of Education)
  • Chuanhuan Yin (School of Computer Science and Technology, Beijing Jiaotong University)
  • Jitao Sang (School of Computer Science and Technology, Beijing Jiaotong University; Beijing Key Laboratory of Traffic Data Mining and Embodied Intelligence; Key Laboratory of Big Data & Artificial Intelligence in Transportation, Ministry of Education)

DOI:

https://doi.org/10.1609/aaai.v40i27.39449

Abstract

Membership Inference Attack (MIA) aims to determine whether a specific data sample was included in the training dataset of a target model. Traditional MIA approaches rely on shadow models to mimic target model behavior, but their effectiveness diminishes for Large Language Model (LLM)-based recommendation systems due to the scale and complexity of training data. This paper introduces a novel knowledge distillation-based MIA paradigm tailored for LLM-based recommendation systems. Our method constructs a reference model via distillation, applying distinct strategies for member and non-member data to enhance discriminative capabilities. The paradigm extracts fused features (e.g., confidence, entropy, loss, and hidden layer vectors) from the reference model to train an attack model, overcoming limitations of individual features. Extensive experiments on extended datasets (Last.FM, MovieLens, Book-Crossing, Delicious) and diverse LLMs (T5, GPT-2, LLaMA3) demonstrate that our approach significantly outperforms shadow model-based MIAs and individual-feature baselines. The results show its practicality for privacy attacks in LLM-driven recommender systems.
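
The fused-feature idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `softmax_stats` and `fused_features` are hypothetical, and it assumes that confidence is the maximum softmax probability, entropy is the Shannon entropy of the output distribution, and loss is the cross-entropy against the true item index. The hidden-layer vector is passed in as a plain list of floats.

```python
import math
from typing import List, Sequence, Tuple


def softmax_stats(logits: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return (probabilities, log-probabilities) using a stable log-sum-exp."""
    m = max(logits)
    log_total = math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - m - log_total for x in logits]
    probs = [math.exp(lp) for lp in log_probs]
    return probs, log_probs


def fused_features(
    logits: Sequence[float], label: int, hidden: Sequence[float]
) -> List[float]:
    """Concatenate scalar signals with a hidden vector (hypothetical helper).

    Assumed definitions, not taken from the paper:
      confidence = max softmax probability
      entropy    = Shannon entropy of the softmax distribution
      loss       = cross-entropy with respect to the index `label`
    """
    probs, log_probs = softmax_stats(logits)
    confidence = max(probs)
    # Skip zero-probability terms; p * log(p) tends to 0 as p tends to 0.
    entropy = -sum(p * lp for p, lp in zip(probs, log_probs) if p > 0.0)
    loss = -log_probs[label]
    return [confidence, entropy, loss, *hidden]


# One sample: reference-model logits over 3 candidate items, true item index 0,
# and a toy 2-dimensional hidden vector.
features = fused_features([2.0, 0.5, -1.0], label=0, hidden=[0.1, -0.3])
```

A vector of this shape, computed for samples with known membership, would be the input to a binary attack classifier. The choice of classifier and of which hidden layer to read is left open here.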

Published

2026-03-14

How to Cite

Li, C., Huang, X., Yin, C., & Sang, J. (2026). Membership Inference Attack Against Large Language Model-Based Recommendation Systems: A New Distillation-Based Paradigm. Proceedings of the AAAI Conference on Artificial Intelligence, 40(27), 22859-22868. https://doi.org/10.1609/aaai.v40i27.39449

Section

AAAI Technical Track on Machine Learning IV