A Deep Ensemble Method for Multi-Agent Reinforcement Learning: A Case Study on Air Traffic Control


  • Supriyo Ghosh IBM Research AI
  • Sean Laguna IBM Research AI
  • Shiau Hong Lim IBM Research AI
  • Laura Wynter IBM Research AI
  • Hasan Poonawala Amazon Web Services


Integration Of Multiple Planning And Scheduling Techniques, Or Of Planning And Scheduling Techniques With Techniques From Other Areas Or Disciplines, Description And Modeling Of Novel Application Domains


Reinforcement learning (RL), a promising framework for data-driven decision making in uncertain environments, has been applied successfully to many real-world operation and control problems. However, applying RL in a large-scale decentralized multi-agent environment remains challenging because of partial observability and limited communication between agents. In this paper, we develop a model-based kernel RL approach and a model-free deep RL approach for learning a decentralized, shared policy among homogeneous agents. By leveraging the strengths of both methods, we further propose a novel deep ensemble multi-agent reinforcement learning (MARL) method that efficiently learns to arbitrate between the decisions of the local kernel-based RL model and the wider-reaching deep RL model. We validate the proposed deep ensemble method on a highly challenging real-world air traffic control problem, where the goal is to guide aircraft so as to avoid air traffic congestion and conflict situations and to improve arrival timeliness, by dynamically recommending adjustments to aircraft speeds in real time. Extensive empirical results from an open-source air traffic management simulation model, developed by Eurocontrol and built on a real-world data set including thousands of aircraft, demonstrate that our proposed deep ensemble MARL method significantly outperforms three state-of-the-art benchmark approaches.
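
The arbitration idea in the abstract can be illustrated with a deliberately small sketch. It is not the paper's implementation: the paper's arbiter is a learned deep model, whereas the hypothetical `Arbiter` below keeps only one running value per sub-policy and ignores the state. The names `SPEED_ACTIONS`, `Arbiter`, and `ensemble_act`, the action values, and the learning rate are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical discrete speed adjustments; values are illustrative only.
SPEED_ACTIONS = [-10, 0, 10]

# A policy maps an agent's local observation to an index into SPEED_ACTIONS.
Policy = Callable[[List[float]], int]


@dataclass
class Arbiter:
    """Keeps a running value estimate for trusting each sub-policy.

    Index 0 stands for the kernel-based policy, index 1 for the deep policy.
    This stateless estimate is a simplification of a learned arbiter.
    """
    lr: float = 0.5
    values: List[float] = field(default_factory=lambda: [0.0, 0.0])

    def choose(self) -> int:
        # Greedy choice; ties go to index 0 (the kernel-based policy).
        return 0 if self.values[0] >= self.values[1] else 1

    def update(self, chosen: int, reward: float) -> None:
        # Move the chosen policy's estimate toward the observed reward.
        self.values[chosen] += self.lr * (reward - self.values[chosen])


def ensemble_act(
    obs: List[float],
    kernel_policy: Policy,
    deep_policy: Policy,
    arbiter: Arbiter,
) -> Tuple[int, int]:
    """Return (index of the trusted policy, the speed adjustment it proposes)."""
    proposals = [kernel_policy(obs), deep_policy(obs)]
    chosen = arbiter.choose()
    return chosen, SPEED_ACTIONS[proposals[chosen]]


# Stand-in policies: the kernel policy always slows down, the deep one speeds up.
kernel_policy: Policy = lambda obs: 0
deep_policy: Policy = lambda obs: 2

arbiter = Arbiter()
first_choice, first_action = ensemble_act([0.0], kernel_policy, deep_policy, arbiter)
arbiter.update(first_choice, reward=-1.0)  # penalize the policy that was followed
second_choice, second_action = ensemble_act([0.0], kernel_policy, deep_policy, arbiter)
```

In this toy run the arbiter first follows the kernel-based proposal, receives a negative reward, and then switches to the deep policy's proposal. A state-dependent arbiter would replace `values` with a function of the observation.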




How to Cite

Ghosh, S., Laguna, S., Lim, S. H., Wynter, L., & Poonawala, H. (2021). A Deep Ensemble Method for Multi-Agent Reinforcement Learning: A Case Study on Air Traffic Control. Proceedings of the International Conference on Automated Planning and Scheduling, 31(1), 468-476. Retrieved from https://ojs.aaai.org/index.php/ICAPS/article/view/15993