Morimura, T., T. Osogami, and T. Shirai. “Mixing-Time Regularized Policy Gradient”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 28, no. 1, June 2014, doi:10.1609/aaai.v28i1.9013.