Dalal, Gal, Balazs Szorenyi, and Gugan Thoppe. “A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound.” Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3701-3708. Accessed March 4, 2024. https://ojs.aaai.org/index.php/AAAI/article/view/5779.