LLM Collaborative Filtering: User-Item Graph as New Language
DOI:
https://doi.org/10.1609/aaai.v40i41.40816
Abstract
In collaborative filtering, learning effective embeddings for users and items from interaction data remains a central challenge. While recent efforts leverage large language models (LLMs) to enhance collaborative filtering, two critical limitations persist: (1) Efficiency: LLM-based inference is significantly slower than traditional embedding-based search; and (2) Topological Modeling: LLMs struggle to capture graph structures, which are essential for modeling multi-order user-item interactions. To address these limitations, we propose New Language Collaborative Filtering (NLCF), a framework that aligns LLMs with collaborative filtering by conceptualizing user-item graphs as new languages. This approach is based on two key insights: (1) LLMs excel at mastering new languages when trained on suitable corpora, and (2) the empirical conditional probabilities between tokens in a corpus converge to the transition probabilities between nodes in the corresponding graph. NLCF translates user-item graphs into corpora in which users and items are treated as tokens. These corpora are used to fine-tune LLMs, and the learned representations are aggregated to construct user and item embeddings that encode multi-order interactions. Unlike methods that deploy LLMs for inference, NLCF distills the LLM knowledge learned from the corpora into compact embeddings, enabling both efficient training and real-time inference. The framework has been deployed on a billion-scale e-commerce platform for several months. Extensive experiments demonstrate that NLCF outperforms traditional graph CF models and LLM-based baselines while achieving significant improvements in training and inference efficiency over LLM-based baselines.
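The second insight in the abstract can be illustrated with a small sketch. The abstract does not say how NLCF builds its corpora, so the example below assumes uniform random walks over a toy bipartite user-item graph, with each walk forming one "sentence" of user and item tokens. All names (`interactions`, `build_graph`, `random_walk_corpus`, `bigram_conditional`, `transition_prob`) and the toy data are hypothetical and not taken from the paper. Under that assumption, the empirical bigram probabilities in the corpus should approach the graph's one-step transition probabilities as the number of walks grows.

```python
import random
from collections import Counter, defaultdict

# Toy interaction data (hypothetical): user -> items the user interacted with.
interactions = {
    "u1": ["i1", "i2"],
    "u2": ["i2", "i3"],
    "u3": ["i1", "i2", "i3"],
}


def build_graph(interactions):
    """Build undirected bipartite adjacency lists from user-item pairs."""
    adj = defaultdict(list)
    for user, items in interactions.items():
        for item in items:
            adj[user].append(item)
            adj[item].append(user)
    return dict(adj)


def random_walk_corpus(adj, num_walks, walk_len, rng):
    """Generate sentences by uniform random walks; every node is a token.

    Assumption: this walk-based construction is for illustration only and
    is not claimed to be the corpus construction used by NLCF.
    """
    nodes = sorted(adj)
    corpus = []
    for _ in range(num_walks):
        node = rng.choice(nodes)
        sentence = [node]
        for _ in range(walk_len - 1):
            node = rng.choice(adj[node])  # step to a uniformly chosen neighbor
            sentence.append(node)
        corpus.append(sentence)
    return corpus


def bigram_conditional(corpus):
    """Empirical P(next token = b | current token = a) from adjacent pairs."""
    pair_counts = Counter()
    prev_counts = Counter()
    for sentence in corpus:
        for a, b in zip(sentence, sentence[1:]):
            pair_counts[(a, b)] += 1
            prev_counts[a] += 1
    return {(a, b): c / prev_counts[a] for (a, b), c in pair_counts.items()}


def transition_prob(adj, a, b):
    """One-step transition probability of a uniform random walk on the graph."""
    return adj[a].count(b) / len(adj[a])


rng = random.Random(0)
adj = build_graph(interactions)
corpus = random_walk_corpus(adj, num_walks=5000, walk_len=10, rng=rng)
empirical = bigram_conditional(corpus)

# Largest deviation between corpus statistics and graph transition probabilities.
max_gap = max(
    abs(p - transition_prob(adj, a, b)) for (a, b), p in empirical.items()
)
print(f"max |empirical - transition| = {max_gap:.4f}")
```

In this sketch, increasing `num_walks` shrinks the printed gap, which is the convergence behavior the abstract refers to. In a pipeline like the one described, sentences of this kind would be the input to LLM fine-tuning; that step is omitted here.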
Published
2026-03-14
How to Cite
Zhou, H., Zhang, Y., Chen, H., Zhang, Q., Shen, Q., Huang, F., & Huang, X. (2026). LLM Collaborative Filtering: User-Item Graph as New Language. Proceedings of the AAAI Conference on Artificial Intelligence, 40(41), 35103–35111. https://doi.org/10.1609/aaai.v40i41.40816
Issue
Section
AAAI Technical Track on Natural Language Processing VI