XU, Yifan; YE, Xichen; CHEN, Yifan; ZHANG, Qiaosheng. When Human Preferences Flip: An Instance-Dependent Robust Loss for RLHF. Proceedings of the AAAI Conference on Artificial Intelligence, [S. l.], v. 40, n. 44, p. 38057–38065, 2026. DOI: 10.1609/aaai.v40i44.41143. Available at: https://ojs.aaai.org/index.php/AAAI/article/view/41143. Accessed on: 14 May 2026.