Yang, L., Zheng, Q., & Pan, G. (2021). Sample Complexity of Policy Gradient Finding Second-Order Stationary Points. Proceedings of the AAAI Conference on Artificial Intelligence, 35(12), 10630–10638. https://doi.org/10.1609/aaai.v35i12.17271