Imitation Improvement Learning for Large-Scale Capacitated Vehicle Routing Problems


  • Viet Bui Singapore Management University
  • Tien Mai Singapore Management University



Reinforcement Learning, Heuristic Learning, Optimal Planning, Deep Learning


Recent works using deep reinforcement learning (RL) to solve routing problems such as the capacitated vehicle routing problem (CVRP) have focused on improvement learning-based methods, which involve improving a given solution until it becomes near-optimal. Although adequate solutions can be achieved for small problem instances, their efficiency degrades for large-scale ones. In this work, we propose a new improvement learning-based framework based on imitation learning where classical heuristics serve as experts to encourage the policy model to mimic and produce similar and better solutions. Moreover, to improve scalability, we propose Clockwise Clustering, a novel augmented framework for decomposing large-scale CVRP into subproblems by clustering sequentially nodes in clockwise order, and then learning to solve them simultaneously. Our approaches enhance state-of-the-art CVRP solvers while attaining competitive solution quality on several well-known datasets, including real-world instances with sizes up to 30,000 nodes. Our best methods are able to achieve new state-of-the-art solutions for several large instances and generalize to a wide range of CVRP variants and solvers. We also contribute new datasets and results to test the generalizability of our deep RL algorithms.




How to Cite

Bui, V., & Mai, T. (2023). Imitation Improvement Learning for Large-Scale Capacitated Vehicle Routing Problems. Proceedings of the International Conference on Automated Planning and Scheduling, 33(1), 551-559.