[1] Knox, W.B., Hatgis-Kessell, S., Adalgeirsson, S.O., Booth, S., Dragan, A., Stone, P. and Niekum, S. 2024. Learning Optimal Advantage from Preferences and Mistaking It for Reward. Proceedings of the AAAI Conference on Artificial Intelligence. 38, 9 (Mar. 2024), 10066-10073. DOI:https://doi.org/10.1609/aaai.v38i9.28870.