Gu, Shangding, Bilgehan Sel, Yuhao Ding, Lu Wang, Qingwei Lin, Ming Jin, and Alois Knoll. 2024. “Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation.” Proceedings of the AAAI Conference on Artificial Intelligence 38 (19): 21099–106. https://doi.org/10.1609/aaai.v38i19.30102.