[1] T. Lancewicki, A. Rosenberg, and Y. Mansour, “Learning Adversarial Markov Decision Processes with Delayed Feedback,” AAAI, vol. 36, no. 7, pp. 7281–7289, Jun. 2022.