[1] Xu, Z., Gavran, I., Ahmad, Y., Majumdar, R., Neider, D., Topcu, U. and Wu, B. 2020. Joint Inference of Reward Machines and Policies for Reinforcement Learning. Proceedings of the International Conference on Automated Planning and Scheduling. 30, 1 (Jun. 2020), 590–598. DOI:https://doi.org/10.1609/icaps.v30i1.6756.