(1) Xu, Z.; Gavran, I.; Ahmad, Y.; Majumdar, R.; Neider, D.; Topcu, U.; Wu, B. Joint Inference of Reward Machines and Policies for Reinforcement Learning. ICAPS 2020, 30, 590-598.