[1] D. Mandal, G. Radanovic, J. Gan, A. Singla, and R. Majumdar, “Online Reinforcement Learning with Uncertain Episode Lengths,” AAAI, vol. 37, no. 7, pp. 9064–9071, Jun. 2023.