[1] M. S. Zhang, M. A. Erdogdu, and A. Garg, “Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings,” AAAI, vol. 36, no. 8, pp. 9066–9073, Jun. 2022.