Zhang, Matthew S., et al. “Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 8, June 2022, pp. 9066–73, doi:10.1609/aaai.v36i8.20891.