Wang, S. (2026) “Scaling and Transferability of Annealing Strategies in Large Language Model Training”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(40), pp. 33639–33647. doi: 10.1609/aaai.v40i40.40653.