(1) Zhu, X.; He, Y.; Hou, H.; Zhang, R.; Zeng, N.; Peng, Y.; Fang, J.; Yu, F. R. PSPO: Prompt-Level Prioritization and Experience-Weighted Smoothing for Efficient Policy Optimization. AAAI 2026, 40, 29186-29194.