(1) Lou, X.; Zhang, J.; Xie, J.; Liu, L.; Yan, D.; Huang, K. Sequential Preference Optimization: Multi-Dimensional Preference Alignment With Implicit Reward Modeling. AAAI 2025, 39, 27509-27517.