Mandal, Debmalya, Goran Radanovic, Jiarui Gan, Adish Singla, and Rupak Majumdar. 2023. “Online Reinforcement Learning With Uncertain Episode Lengths.” Proceedings of the AAAI Conference on Artificial Intelligence 37 (7): 9064-71. https://doi.org/10.1609/aaai.v37i7.26088.