Morimura, Tetsuro, Takayuki Osogami, and Tomoyuki Shirai. 2014. “Mixing-Time Regularized Policy Gradient.” Proceedings of the AAAI Conference on Artificial Intelligence 28 (1). https://doi.org/10.1609/aaai.v28i1.9013.