Metcalf, Katherine, Miguel Sarabia, Masha Fedzechkina, and Barry-John Theobald. “Can You Rely on Synthetic Labellers in Preference-Based Reinforcement Learning? It’s Complicated”. Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 9 (March 24, 2024): 10128–10136. Accessed May 19, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/28877.