[1] Chen, Y., Zhang, X., Xie, Q. and Zhu, X. 2024. Exact Policy Recovery in Offline RL with Both Heavy-Tailed Rewards and Data Corruption. Proceedings of the AAAI Conference on Artificial Intelligence. 38, 10 (Mar. 2024), 11416-11424. DOI:https://doi.org/10.1609/aaai.v38i10.29022.