1. Chen Y, Zhang X, Xie Q, Zhu X. Exact Policy Recovery in Offline RL with Both Heavy-Tailed Rewards and Data Corruption. AAAI [Internet]. 2024 Mar 24 [cited 2026 Apr 23];38(10):11416-24. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/29022