[1] C. Yue, “Promoting Efficient Reasoning with Verifiable Stepwise Reward”, AAAI, vol. 40, no. 41, pp. 34530–34538, Mar. 2026.