Topin, N., S. Milani, F. Fang, and M. Veloso. “Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 11, May 2021, pp. 9923-31, doi:10.1609/aaai.v35i11.17192.