XU, Z.; GAVRAN, I.; AHMAD, Y.; MAJUMDAR, R.; NEIDER, D.; TOPCU, U.; WU, B. Joint Inference of Reward Machines and Policies for Reinforcement Learning. Proceedings of the International Conference on Automated Planning and Scheduling, [S. l.], v. 30, n. 1, p. 590-598, 2020. Available at: https://ojs.aaai.org/index.php/ICAPS/article/view/6756. Accessed on: 28 Feb. 2021.