(1) Liu, Z.; Tu, J.; Hong, Y.; Xiong, L.; Jin, Y.; Tang, Y.; Li, F. HCPO: Hierarchical Conductor-Based Policy Optimization in Multi-Agent Reinforcement Learning. AAAI 2026, 40, 29564-29572.