Imitation Improvement Learning for Large-Scale Capacitated Vehicle Routing Problems
Keywords:Reinforcement Learning, Heuristic Learning, Optimal Planning, Deep Learning
AbstractRecent works using deep reinforcement learning (RL) to solve routing problems such as the capacitated vehicle routing problem (CVRP) have focused on improvement learning-based methods, which involve improving a given solution until it becomes near-optimal. Although adequate solutions can be achieved for small problem instances, their efficiency degrades for large-scale ones. In this work, we propose a new improvement learning-based framework based on imitation learning where classical heuristics serve as experts to encourage the policy model to mimic and produce similar and better solutions. Moreover, to improve scalability, we propose Clockwise Clustering, a novel augmented framework for decomposing large-scale CVRP into subproblems by clustering sequentially nodes in clockwise order, and then learning to solve them simultaneously. Our approaches enhance state-of-the-art CVRP solvers while attaining competitive solution quality on several well-known datasets, including real-world instances with sizes up to 30,000 nodes. Our best methods are able to achieve new state-of-the-art solutions for several large instances and generalize to a wide range of CVRP variants and solvers. We also contribute new datasets and results to test the generalizability of our deep RL algorithms.
How to Cite
Bui, V., & Mai, T. (2023). Imitation Improvement Learning for Large-Scale Capacitated Vehicle Routing Problems. Proceedings of the International Conference on Automated Planning and Scheduling, 33(1), 551-559. https://doi.org/10.1609/icaps.v33i1.27236
Planning and Learning Track