Lancewicki, T., Rosenberg, A. and Mansour, Y. (2022) “Learning Adversarial Markov Decision Processes with Delayed Feedback”, Proceedings of the AAAI Conference on Artificial Intelligence, 36(7), pp. 7281-7289. doi: 10.1609/aaai.v36i7.20690.