Liao, W., Song, X., & Lu, H. (2026). DRIFT: Difference-Aware Reinforcement Through Iterative Fine-Tuning for Language Model. Proceedings of the AAAI Conference on Artificial Intelligence, 40(38), 31988–31996. https://doi.org/10.1609/aaai.v40i38.40469