DOGE: LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation

Authors

  • Fanshen Meng, Beijing University of Posts and Telecommunications
  • Zhenhua Meng, Beijing University of Posts and Telecommunications
  • Ru Jin, Beijing University of Posts and Telecommunications
  • Rongheng Lin, Beijing University of Posts and Telecommunications
  • Budan Wu, Beijing University of Posts and Telecommunications

DOI

https://doi.org/10.1609/aaai.v39i12.33351

Abstract

Multimodal recommender systems have attracted growing interest in recent years. These systems model user preferences from both user-item interaction data and the multimodal information attached to items, and they frequently achieve higher recommendation accuracy than traditional models that rely solely on user-item interactions. Existing methods nonetheless have three limitations: they make relatively little use of image features when propagating item-item characteristics, they rely too heavily on text feature similarity, and they often neglect the deeper relationships among items, users, and modalities. To address these limitations, we introduce the LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation (DOGE). DOGE uses large language models (LLMs) to interpret image information under the guidance of text information, generating cross-modal features that strengthen the relationship between the text and image modalities. DOGE then constructs a Hyper-Knowledge Graph (HKG) from user-item interaction information and the LLM-enhanced modality features. This graph covers a wide range of item-item and user-user binary relations and hyper-relations, which broadens the feature propagation mechanisms and mitigates the overreliance on the text modality. By learning on heterogeneous user-item graphs and on homogeneous item-item and user-user graphs, DOGE improves propagation between item features and user features and obtains more effective representations of users and items. Comprehensive experiments on three public real-world datasets show that DOGE attains state-of-the-art (SOTA) performance, with a 7.2% improvement over the strongest baseline.
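
The abstract does not spell out how the graph is built or how features propagate, so the following is only a minimal sketch of one common reading, not the authors' implementation. It assumes that binary item-item relations come from a cosine-similarity k-nearest-neighbour graph over (hypothetical) cross-modal item features, and that each user's set of interacted items forms one hyperedge. The function names `knn_item_graph` and `hypergraph_propagate`, the parameter `k`, and the mean-aggregation rule are all illustrative assumptions.

```python
import numpy as np


def knn_item_graph(features, k):
    """Row-normalized item-item adjacency that keeps, for each item,
    its k most cosine-similar neighbours (assumed construction)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)           # exclude self-similarity
    top = np.argsort(-sim, axis=1)[:, :k]    # indices of the k nearest items
    adj = np.zeros_like(sim)
    rows = np.arange(sim.shape[0])[:, None]
    adj[rows, top] = 1.0
    return adj / adj.sum(axis=1, keepdims=True)


def hypergraph_propagate(incidence, x):
    """One mean-aggregation step over a hypergraph: items -> hyperedges -> items.

    incidence has shape (num_items, num_hyperedges); entry (i, e) is 1 when
    item i belongs to hyperedge e. No learnable weights are used here.
    """
    edge_deg = np.clip(incidence.sum(axis=0), 1, None)  # items per hyperedge
    node_deg = np.clip(incidence.sum(axis=1), 1, None)  # hyperedges per item
    edge_feat = (incidence.T @ x) / edge_deg[:, None]
    return (incidence @ edge_feat) / node_deg[:, None]


# Toy data (made up for illustration): 2 users, 3 items, 2-dimensional features.
interactions = np.array([[1.0, 1.0, 0.0],
                         [0.0, 1.0, 1.0]])   # users x items
item_features = np.array([[1.0, 0.0],
                          [0.9, 0.1],
                          [0.0, 1.0]])

item_adj = knn_item_graph(item_features, k=1)
binary_view = item_adj @ item_features                        # pairwise relations
hyper_view = hypergraph_propagate(interactions.T, item_features)  # group relations
```

In this sketch the two views carry different signals: `binary_view` mixes each item with its most similar item in feature space, while `hyper_view` mixes items that were consumed together by the same user, regardless of how similar their features are.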

Published

2025-04-11

How to Cite

Meng, F., Meng, Z., Jin, R., Lin, R., & Wu, B. (2025). DOGE: LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation. Proceedings of the AAAI Conference on Artificial Intelligence, 39(12), 12399–12407. https://doi.org/10.1609/aaai.v39i12.33351

Section

AAAI Technical Track on Data Mining & Knowledge Management II