Hu, Tianmeng, and Biao Luo. 2024. “PA2D-MORL: Pareto Ascent Directional Decomposition Based Multi-Objective Reinforcement Learning.” Proceedings of the AAAI Conference on Artificial Intelligence 38 (11): 12547–55. https://doi.org/10.1609/aaai.v38i11.29148.