Dohmen, T., N. Topper, G. Atia, A. Beckus, A. Trivedi, and A. Velasquez. “Inferring Probabilistic Reward Machines from Non-Markovian Reward Signals for Reinforcement Learning”. Proceedings of the International Conference on Automated Planning and Scheduling, vol. 32, no. 1, June 2022, pp. 574-82, doi:10.1609/icaps.v32i1.19844.