Xiao, T., & Wang, S. (2022). Towards Off-Policy Learning for Ranking Policies with Logged Feedback. Proceedings of the AAAI Conference on Artificial Intelligence, 36(8), 8700-8707. https://doi.org/10.1609/aaai.v36i8.20849