(1) Hu, T.; Luo, B. PA2D-MORL: Pareto Ascent Directional Decomposition Based Multi-Objective Reinforcement Learning. AAAI 2024, 38, 12547-12555.