(1) Guo, X.; Hu, A.; Zhang, J. Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods. AAAI 2022, 36, 6774-6782.