[1] Low, S.M., Kumar, A. and Sanner, S. 2022. Sample-Efficient Iterative Lower Bound Optimization of Deep Reactive Policies for Planning in Continuous MDPs. Proceedings of the AAAI Conference on Artificial Intelligence. 36, 9 (Jun. 2022), 9840-9848. DOI:https://doi.org/10.1609/aaai.v36i9.21220.