Dohmen, Taylor, Noah Topper, George Atia, Andre Beckus, Ashutosh Trivedi, and Alvaro Velasquez. 2022. “Inferring Probabilistic Reward Machines from Non-Markovian Reward Signals for Reinforcement Learning”. Proceedings of the International Conference on Automated Planning and Scheduling 32 (1):574-82. https://doi.org/10.1609/icaps.v32i1.19844.