Dalal, Gal, Balazs Szorenyi, and Gugan Thoppe. 2020. “A Tale of Two-Timescale Reinforcement Learning With the Tightest Finite-Time Bound.” Proceedings of the AAAI Conference on Artificial Intelligence 34 (04): 3701–8. https://doi.org/10.1609/aaai.v34i04.5779.