YUE, Chuhuai; DONG, Chengqi; GAO, Yinan; HE, Hang; CHAI, Jiajun; LIN, Wei; YIN, Guojun. Promoting Efficient Reasoning with Verifiable Stepwise Reward. Proceedings of the AAAI Conference on Artificial Intelligence, [S. l.], v. 40, n. 41, p. 34530–34538, 2026. DOI: 10.1609/aaai.v40i41.40752. Available at: https://ojs.aaai.org/index.php/AAAI/article/view/40752. Accessed: 14 May 2026.