Lopez, R., Dhillon, I. S. and Jordan, M. I. (2021) “Learning from eXtreme Bandit Feedback”, Proceedings of the AAAI Conference on Artificial Intelligence, 35(10), pp. 8732-8740. doi: 10.1609/aaai.v35i10.17058.