(1) Topin, N.; Milani, S.; Fang, F.; Veloso, M. Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods. AAAI 2021, 35, 9923-9931.