Zhai, Yuanzhao, Yiying Li, Zijian Gao, Xudong Gong, Kele Xu, Dawei Feng, Ding Bo, and Huaimin Wang. “Optimistic Model Rollouts for Pessimistic Offline Policy Optimization”. Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 15 (March 24, 2024): 16678–16686. Accessed May 26, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/29607.