Chen, Z., & Tan, V. Y. F. (2026). On the Exponential Convergence for Offline RLHF with Pairwise Comparisons. Proceedings of the AAAI Conference on Artificial Intelligence, 40(44), 37277–37285. https://doi.org/10.1609/aaai.v40i44.41059