[1] De Asis, K., Chan, A., Pitis, S., Sutton, R. and Graves, D. 2020. Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence. 34, 04 (Apr. 2020), 3741-3748. DOI: https://doi.org/10.1609/aaai.v34i04.5784.