Dukkipati, Ambedkar, Ranga Shaarad Ayyagari, Bodhisattwa Dasgupta, Parag Dutta, and Prabhas Reddy Onteru. “Active Reinforcement Learning Strategies for Offline Policy Improvement.” Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 16 (April 11, 2025): 16418–16425. Accessed May 8, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/33803.