[1] Morimura, T., Osogami, T. and Shirai, T. 2014. Mixing-Time Regularized Policy Gradient. Proceedings of the AAAI Conference on Artificial Intelligence. 28, 1 (Jun. 2014). DOI:https://doi.org/10.1609/aaai.v28i1.9013.