Improving the Efficiency and Efficacy of Multi-Agent Reinforcement Learning on Complex Railway Networks with a Local-Critic Approach

Authors

  • Yuan Zhang, Neurorobotics Lab, University of Freiburg, Germany
  • Umashankar Deekshith, Deutsche Bahn AG, Germany
  • Jianhong Wang, Center for AI Fundamentals, University of Manchester, UK
  • Joschka Boedecker, Neurorobotics Lab, University of Freiburg, Germany

DOI:

https://doi.org/10.1609/icaps.v34i1.31533

Abstract

Complex railway networks are challenging real-world multi-agent systems, typically involving thousands of agents. Current planning methods depend heavily on expert knowledge to formulate solutions for specific cases and therefore generalize poorly to new scenarios, which has drawn significant attention to multi-agent reinforcement learning (MARL). Despite some successful applications in multi-agent decision-making tasks, MARL remains hard to scale to large numbers of agents. This paper rethinks the curse of agents in the centralized-training-decentralized-execution (CTDE) paradigm and proposes a local-critic approach to address the issue. Combining the local critic with the PPO algorithm, we design a deep MARL algorithm denoted local-critic PPO (LCPPO). In experiments, we evaluate the effectiveness of LCPPO on a complex railway network benchmark, Flatland, with varying numbers of agents. Notably, LCPPO shows strong generalizability and robustness under changes to the environment.
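
The abstract does not spell out the LCPPO architecture, but the local-critic idea it names can be illustrated with a minimal PyTorch sketch (all class names and dimensions below are hypothetical, chosen for illustration only; a standard centralized critic is shown for contrast). The point is the input size: a centralized CTDE critic consumes the joint state, so its input grows with the number of agents, whereas a local critic conditions each agent's value estimate only on that agent's own observation.

    # Hypothetical sketch of the local-critic idea under CTDE; not the
    # paper's exact architecture.
    import torch
    import torch.nn as nn

    class CentralizedCritic(nn.Module):
        """Standard CTDE critic: input scales with the number of agents."""
        def __init__(self, obs_dim: int, n_agents: int, hidden: int = 128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim * n_agents, hidden), nn.Tanh(),
                nn.Linear(hidden, 1),
            )

        def forward(self, joint_obs: torch.Tensor) -> torch.Tensor:
            # joint_obs: (batch, n_agents * obs_dim)
            return self.net(joint_obs)

    class LocalCritic(nn.Module):
        """Local critic: input size is independent of the number of agents."""
        def __init__(self, obs_dim: int, hidden: int = 128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim, hidden), nn.Tanh(),
                nn.Linear(hidden, 1),
            )

        def forward(self, local_obs: torch.Tensor) -> torch.Tensor:
            # local_obs: (batch * n_agents, obs_dim); one value per agent,
            # with parameters shared across agents, so the cost per agent
            # stays constant as the population grows.
            return self.net(local_obs)

In a PPO-style update, the local value estimates would feed the per-agent advantage computation in place of a single centralized value, which is, roughly, how a local-critic approach sidesteps the input blow-up of a fully centralized critic on networks with thousands of trains.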

Published

2024-05-30

How to Cite

Zhang, Y., Deekshith, U., Wang, J., & Boedecker, J. (2024). Improving the Efficiency and Efficacy of Multi-Agent Reinforcement Learning on Complex Railway Networks with a Local-Critic Approach. Proceedings of the International Conference on Automated Planning and Scheduling, 34(1), 698-706. https://doi.org/10.1609/icaps.v34i1.31533