Learning to Curate Context: Jointly Optimizing Retrieval and Prediction for Multimodal Social Media Popularity

Xovee Xu; Shuojun Lin; Fan Zhou; Jingkuan Song

doi:10.1609/aaai.v40i2.37112

Authors

Xovee Xu University of Electronic Science and Technology of China
Shuojun Lin University of Electronic Science and Technology of China
Fan Zhou University of Electronic Science and Technology of China
Jingkuan Song University of Electronic Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v40i2.37112

Abstract

Predicting the popularity of user-generated content (UGC) is a crucial but challenging task in social media analysis. While existing retrieval-augmented models enhance predictions by supplying rich contextual information, they remain limited by a fundamental precision-recall dilemma: enlarging the retrieval set increases coverage but introduces noisy, irrelevant context that harms prediction. In this work, we propose a unified framework that learns to retrieve, filter, and predict. Central to our approach is a Mixture-of-Logits-based retrieval module that replaces static similarity metrics with a dynamic, multi-faceted scoring function, enabling the retriever to be directly optimized by the prediction objective. An uncertainty-aware filter then performs differentiable subset selection and refines the selected representations using the information bottleneck principle. Finally, to enhance predictive robustness, we introduce a confidence-weighted test-time perturbation strategy. By learning to retrieve UGC items that are beneficial for prediction and filtering out uncertainty, our framework provides more relevant and reliable context. Extensive experiments demonstrate that the proposed framework achieves state-of-the-art performance, consistently outperforming strong baselines.
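To make the idea of a multi-faceted scoring function concrete, the following is a minimal sketch of a generic Mixture-of-Logits score, not the module described in the paper. It assumes each item is represented by several facet embeddings, computes one dot-product logit per (query facet, candidate facet) pair, and combines the logits with softmax gating weights. The names `mol_score`, `retrieve`, and `gate_weights`, and the choice of a linear gate over the logits, are hypothetical and introduced only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mol_score(query_facets, cand_facets, gate_weights):
    """Score one candidate against one query (illustrative only).

    query_facets, cand_facets: lists of facet embedding vectors.
    gate_weights: hypothetical square matrix (one row per logit)
    that maps the logits to gating scores.
    """
    # One similarity logit per (query facet, candidate facet) pair.
    logits = [dot(q, c) for q in query_facets for c in cand_facets]
    # Assumed gate: a linear map of the logits, normalized by softmax.
    gates = softmax([dot(row, logits) for row in gate_weights])
    # Final score is a convex combination of the component logits.
    return sum(g * l for g, l in zip(gates, logits))


def retrieve(query_facets, candidates, gate_weights, k):
    """Return the indices of the k highest-scoring candidates."""
    scored = [
        (mol_score(query_facets, c, gate_weights), i)
        for i, c in enumerate(candidates)
    ]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]


if __name__ == "__main__":
    query = [[1.0, 0.0], [0.0, 1.0]]
    pool = [
        [[1.0, 0.0], [0.0, 1.0]],  # overlaps with the query facets
        [[0.0, 0.0], [0.0, 0.0]],  # carries no signal
    ]
    uniform_gate = [[0.0] * 4 for _ in range(4)]  # all-zero gate: equal weights
    print(retrieve(query, pool, uniform_gate, k=1))
```

Because the gating weights depend on the query-candidate pair, the score is not a single fixed similarity metric. In a trainable version, the gate and the embeddings would be parameters updated by the prediction loss, which is the property the abstract highlights; this sketch uses fixed values and performs no training.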
