De Asis, K., Chan, A., Pitis, S., Sutton, R., & Graves, D. (2020). Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 34(04), 3741-3748. https://doi.org/10.1609/aaai.v34i04.5784