[1] X. Guo, A. Hu, and J. Zhang, “Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods,” AAAI, vol. 36, no. 6, pp. 6774-6782, Jun. 2022.