Xu, Haoran, Xianyuan Zhan, and Xiangyu Zhu. “Constraints Penalized Q-Learning for Safe Offline Reinforcement Learning.” Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8753-8760. Accessed July 20, 2024. https://ojs.aaai.org/index.php/AAAI/article/view/20855.