Wang, C., Huo, Y., Gan, Y., Mu, Y., He, Q., Yang, M., … Xiao, T. (2026). Probing Preference Representations: A Multi-Dimensional Evaluation and Analysis Method for Reward Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(39), 33404–33412. https://doi.org/10.1609/aaai.v40i39.40627