Chen, Y., Zhang, X., Xie, Q., & Zhu, X. (2024). Exact Policy Recovery in Offline RL with Both Heavy-Tailed Rewards and Data Corruption. Proceedings of the AAAI Conference on Artificial Intelligence, 38(10), 11416-11424. https://doi.org/10.1609/aaai.v38i10.29022