Zhang, Matthew S., Murat A. Erdogdu, and Animesh Garg. 2022. “Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings.” Proceedings of the AAAI Conference on Artificial Intelligence 36 (8): 9066-73. https://doi.org/10.1609/aaai.v36i8.20891.