[1] Y. Xu, X. Ye, Y. Chen, and Q. Zhang, “When Human Preferences Flip: An Instance-Dependent Robust Loss for RLHF”, AAAI, vol. 40, no. 44, pp. 38057–38065, Mar. 2026.