Zhang, Z., Duan, M., Ye, Y., & Zhang, H. R. (2026). Scalable Multi-Objective and Meta Reinforcement Learning via Gradient Estimation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(34), 28609–28617. https://doi.org/10.1609/aaai.v40i34.40092