[1] T. Hu and B. Luo, “PA2D-MORL: Pareto Ascent Directional Decomposition Based Multi-Objective Reinforcement Learning,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 11, pp. 12547–12555, Mar. 2024.