(1) Zhang, M. S.; Erdogdu, M. A.; Garg, A. Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings. AAAI 2022, 36, 9066–9073.