Revisiting Fairness-aware Interactive Recommendation: Item Lifecycle as a Control Knob

Yun Lu; Xiaoyu Shi; Hong Xie; Chongjun Xia; Zhenhui Gong; Mingsheng Shang

doi:10.1609/aaai.v40i18.38575

Authors

Yun Lu, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences; Chongqing School, University of Chinese Academy of Sciences
Xiaoyu Shi, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences; Chongqing School, University of Chinese Academy of Sciences
Hong Xie, The First Affiliated Hospital, University of Science and Technology of China
Chongjun Xia, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences; Chongqing School, University of Chinese Academy of Sciences
Zhenhui Gong, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences; Chongqing School, University of Chinese Academy of Sciences
Mingsheng Shang, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences; Chongqing School, University of Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v40i18.38575

Abstract

This paper revisits fairness-aware interactive recommendation on platforms such as TikTok and KuaiShou by introducing a novel control knob: the item lifecycle. We make three contributions. First, we conduct a comprehensive empirical analysis and find that item lifecycles on short-video platforms follow a compressed three-phase pattern (rapid growth, transient stability, and sharp decay), which deviates significantly from the classical four-stage model (introduction, growth, maturity, decline). Second, we introduce LHRL, a lifecycle-aware hierarchical reinforcement learning framework that dynamically balances fairness and accuracy by leveraging phase-specific exposure dynamics. LHRL consists of two key components: (1) PhaseFormer, a lightweight encoder that combines STL decomposition with attention mechanisms for robust phase detection; and (2) a two-level HRL agent, in which the high-level policy imposes phase-aware fairness constraints and the low-level policy optimizes immediate user engagement. This decoupled optimization reconciles long-term equity with short-term utility. Third, experiments on multiple real-world interactive recommendation datasets demonstrate that LHRL significantly improves both fairness and user engagement. Furthermore, integrating lifecycle-aware rewards into existing RL-based models consistently yields performance gains, highlighting the generalizability and practical value of our approach.
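
To make the three-phase pattern and the idea of a lifecycle-aware reward more concrete, the following is a minimal sketch only. It is not PhaseFormer and does not use STL decomposition or attention: it swaps in a plain moving-average slope heuristic to label each time step of an item's exposure series as growth, stability, or decay. The function names (`moving_average`, `label_phases`, `shaped_reward`), the tolerance `tol`, and the per-phase bonus values are all hypothetical and chosen purely for illustration; the abstract does not specify the actual reward design.

```python
from typing import Dict, List


def moving_average(xs: List[float], window: int) -> List[float]:
    """Trailing moving average; early points average over what is available."""
    out = []
    for i in range(len(xs)):
        lo = max(0, i - window + 1)
        chunk = xs[lo:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def label_phases(exposure: List[float], window: int = 3,
                 tol: float = 0.05) -> List[str]:
    """Label each step 'growth', 'stability', or 'decay'.

    Assumption: the phase is read off the step-to-step change of the
    smoothed series, normalised by its peak. This is a stand-in heuristic,
    not the phase detector described in the abstract.
    """
    if not exposure:
        return []
    smooth = moving_average(exposure, window)
    peak = max(smooth) or 1.0  # avoid division by zero for all-zero series
    labels = []
    for i in range(len(smooth)):
        prev = smooth[i - 1] if i > 0 else smooth[0]
        slope = (smooth[i] - prev) / peak
        if slope > tol:
            labels.append("growth")
        elif slope < -tol:
            labels.append("decay")
        else:
            labels.append("stability")
    return labels


# Hypothetical per-phase adjustments; the values are arbitrary placeholders.
PHASE_BONUS: Dict[str, float] = {"growth": 0.5, "stability": 0.0, "decay": -0.2}


def shaped_reward(engagement: float, phase: str, weight: float = 1.0) -> float:
    """Add a phase-dependent term to an engagement reward (illustrative)."""
    return engagement + weight * PHASE_BONUS[phase]


if __name__ == "__main__":
    # Synthetic exposure counts: fast rise, short plateau, sharp drop.
    series = [1, 5, 20, 60, 62, 61, 60, 30, 8, 2]
    phases = label_phases(series)
    for count, phase in zip(series, phases):
        print(count, phase, shaped_reward(1.0, phase))
```

In a setup like the one the abstract describes, a term of this kind would feed into the reward of an RL-based recommender, with the phase label supplied by a learned detector rather than a fixed slope threshold.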
