Chen, Y., X. Zhang, Q. Xie, and X. Zhu. “Exact Policy Recovery in Offline RL With Both Heavy-Tailed Rewards and Data Corruption”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 10, Mar. 2024, pp. 11416-24, doi:10.1609/aaai.v38i10.29022.