Pan, Junshu, Wei Shen, Shulin Huang, Qiji Zhou, and Yue Zhang. 2026. “Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model”. Proceedings of the AAAI Conference on Artificial Intelligence 40 (38):32646-54. https://doi.org/10.1609/aaai.v40i38.40542.