STAIR: Manipulating Collaborative and Multimodal Information for E-Commerce Recommendation

Authors

  • Cong Xu, East China Normal University
  • Yunhang He, East China Normal University
  • Jun Wang, East China Normal University
  • Wei Zhang, East China Normal University

DOI:

https://doi.org/10.1609/aaai.v39i12.33407

Abstract

While most multimodal recommendation methods focus on mining modalities, we argue that fully utilizing both collaborative and multimodal information is pivotal in e-commerce scenarios where, as clarified in this work, user behaviors are rarely determined entirely by multimodal features. Combining these two distinct types of information raises additional challenges: 1) Modality erasure: vanilla graph convolution, although rather useful in collaborative filtering, erases multimodal information; 2) Modality forgetting: multimodal information tends to be gradually forgotten because the recommendation loss essentially facilitates the learning of collaborative information. To this end, we propose STAIR, a novel approach that employs stepwise graph convolution to enable the co-existence of collaborative and multimodal information in e-commerce recommendation. In addition, it is initialized with the raw multimodal features, and the forgetting problem is significantly alleviated through constrained embedding updates. As a result, STAIR achieves state-of-the-art recommendation performance on three public e-commerce datasets with minimal computational and memory costs.

Published

2025-04-11

How to Cite

Xu, C., He, Y., Wang, J., & Zhang, W. (2025). STAIR: Manipulating Collaborative and Multimodal Information for E-Commerce Recommendation. Proceedings of the AAAI Conference on Artificial Intelligence, 39(12), 12899–12907. https://doi.org/10.1609/aaai.v39i12.33407

Issue

Vol. 39 No. 12

Section

AAAI Technical Track on Data Mining & Knowledge Management II