Rezaeifar, Shideh, Robert Dadashi, Nino Vieillard, Léonard Hussenot, Olivier Bachem, Olivier Pietquin, and Matthieu Geist. 2022. “Offline Reinforcement Learning as Anti-Exploration.” Proceedings of the AAAI Conference on Artificial Intelligence 36 (7): 8106–14. https://doi.org/10.1609/aaai.v36i7.20783.