Dohmen, Taylor, Noah Topper, George Atia, Andre Beckus, Ashutosh Trivedi, and Alvaro Velasquez. “Inferring Probabilistic Reward Machines from Non-Markovian Reward Signals for Reinforcement Learning.” Proceedings of the International Conference on Automated Planning and Scheduling 32, no. 1 (June 13, 2022): 574–582. Accessed April 25, 2024. https://ojs.aaai.org/index.php/ICAPS/article/view/19844.