[1] Y. Sun, Z. Zhao, Y. Wei, Y. Zhang, and C. Gong, “Well Begun, Half Done: Reinforcement Learning with Prefix Optimization for LLM Reasoning,” AAAI, vol. 40, no. 39, pp. 33144–33152, Mar. 2026.