Zhai, Yuanzhao, Tingkai Yang, Kele Xu, Dawei Feng, Cheng Yang, Bo Ding, and Huaimin Wang. “Enhancing Decision-Making for LLM Agents via Step-Level Q-Value Models”. Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 25 (April 11, 2025): 27161–27169. Accessed May 31, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/34924.