[1] G. Dalal, B. Szorenyi, and G. Thoppe, “A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound”, AAAI, vol. 34, no. 04, pp. 3701-3708, Apr. 2020.