[1]

G. Neustroev, M. de Weerdt, and R. Verzijlbergh, “Discovery of Optimal Solution Horizons in Non-Stationary Markov Decision Processes with Unbounded Rewards,” ICAPS, vol. 29, no. 1, pp. 292–300, May 2021.