Ma, Yecheng Jason, Andrew Shen, Osbert Bastani, and Dinesh Jayaraman. “Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning”. Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 5 (June 28, 2022): 5404–5412. Accessed July 21, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/20478.