Low, S. M., A. Kumar, and S. Sanner. “Sample-Efficient Iterative Lower Bound Optimization of Deep Reactive Policies for Planning in Continuous MDPs”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 9, June 2022, pp. 9840-9848, doi:10.1609/aaai.v36i9.21220.