(1) Yue, C.; Dong, C.; Gao, Y.; He, H.; Chai, J.; Lin, W.; Yin, G. Promoting Efficient Reasoning With Verifiable Stepwise Reward. AAAI 2026, 40, 34530-34538.