Gu, Shangding, et al. “Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 19, Mar. 2024, pp. 21099-21106, doi:10.1609/aaai.v38i19.30102.