Zhang, Guoxi, and Hisashi Kashima. 2023. “Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning.” Proceedings of the AAAI Conference on Artificial Intelligence 37 (9): 11201–9. https://doi.org/10.1609/aaai.v37i9.26326.