(1) Zhang, G.; Wang, Y.; Chen, X.; Qian, H.; Zhan, K.; Wang, B. UNEX-RL: Reinforcing Long-Term Rewards in Multi-Stage Recommender Systems With UNidirectional EXecution. AAAI 2024, 38, 9305-9313.