Pan, J. (2026) “Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(38), pp. 32646–32654. doi: 10.1609/aaai.v40i38.40542.