[1] Y. Chen, X. Zhang, Q. Xie, and X. Zhu, “Exact Policy Recovery in Offline RL with Both Heavy-Tailed Rewards and Data Corruption,” AAAI, vol. 38, no. 10, pp. 11416–11424, Mar. 2024.