Hu T, Luo B. PA2D-MORL: Pareto Ascent Directional Decomposition Based Multi-Objective Reinforcement Learning. AAAI [Internet]. 2024 Mar. 24 [cited 2026 May 14];38(11):12547-55. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/29148