(1) Xiao, T.; Wang, S. Towards Off-Policy Learning for Ranking Policies With Logged Feedback. AAAI 2022, 36, 8700-8707.