Shani, Lior, Yonathan Efroni, and Shie Mannor. 2020. “Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs.” Proceedings of the AAAI Conference on Artificial Intelligence 34 (04): 5668–75. https://doi.org/10.1609/aaai.v34i04.6021.