(1) Liao, W.; Song, X.; Lu, H. DRIFT: Difference-Aware Reinforcement Through Iterative Fine-Tuning for Language Model. AAAI 2026, 40, 31988-31996.