(1) Wang, S.; Chen, Z.; Xiao, T.; Lv, Z.; Yang, J.; Cai, X.; Wang, J.; Li, X. Scaling and Transferability of Annealing Strategies in Large Language Model Training. AAAI 2026, 40, 33639-33647.