1. Gu S, Sel B, Ding Y, Wang L, Lin Q, Jin M, et al. Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation. AAAI [Internet]. 2024 Mar 24 [cited 2026 May 26];38(19):21099-106. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/30102