Knox, W. B., Hatgis-Kessell, S., Adalgeirsson, S. O., Booth, S., Dragan, A., Stone, P., & Niekum, S. (2024). Learning Optimal Advantage from Preferences and Mistaking It for Reward. Proceedings of the AAAI Conference on Artificial Intelligence, 38(9), 10066-10073. https://doi.org/10.1609/aaai.v38i9.28870