Learning-Guided Simulated Annealing for the Capacitated Vehicle Routing Problem
DOI:
https://doi.org/10.1609/icaps.v36i1.42875
Abstract
The Capacitated Vehicle Routing Problem (CVRP) is a fundamental combinatorial optimization challenge with broad industrial relevance. Classical metaheuristics such as Simulated Annealing (SA) offer asymptotic convergence guarantees but suffer from inefficient random neighborhood exploration. Conversely, recent deep learning approaches generate solutions rapidly but often struggle to generalize beyond the instance sizes encountered during training. In this paper, we bridge this gap by proposing Learning-Guided Simulated Annealing (LG-SA), a hybrid framework that augments SA with a very lightweight neural module, trained to select promising neighboring solutions instead of relying on uniform sampling. The move-selection policy is learned through reinforcement learning (RL) using Proximal Policy Optimization (PPO). Through extensive experiments, we analyze the stability of the model as well as the impact of diverse feasibility mechanisms, initialization strategies, neighborhood operators, and action parameterizations (joint vs. conditional). We further show that LG-SA excels at finding high-quality solutions rapidly. In addition to achieving a 42.5% cost reduction over classical SA and generalizing well to larger instances, LG-SA outperforms or performs on par with widely used methods like OR-Tools, attention-based GNNs, and advanced RL methods within comparable or shorter timeframes.
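To make the core idea concrete, the following is a minimal sketch of simulated annealing for CVRP in which the move proposer is a pluggable function, so that uniform sampling can be swapped for a scored selection. It is an illustration under stated assumptions, not the authors' implementation: the permutation encoding with greedy capacity splitting, the swap neighborhood, the geometric cooling schedule, and the names `route_cost`, `uniform_proposer`, `make_scored_proposer`, and `anneal` are all hypothetical choices made here. In particular, `score_fn` is only a hand-written stand-in for where a trained policy (such as one learned with PPO) could plug in.

```python
import math
import random

def route_cost(perm, coords, demands, capacity):
    """Cost of a customer permutation, split greedily into capacity-feasible routes."""
    depot = coords[0]
    total, load, prev = 0.0, 0, depot
    for c in perm:
        if load + demands[c] > capacity:      # close the current route, start a new one
            total += math.dist(prev, depot)
            load, prev = 0, depot
        total += math.dist(prev, coords[c])
        load += demands[c]
        prev = coords[c]
    return total + math.dist(prev, depot)     # return to the depot after the last customer

def uniform_proposer(perm, rng):
    """Classical SA: pick a swap move (two positions) uniformly at random."""
    return tuple(rng.sample(range(len(perm)), 2))

def make_scored_proposer(score_fn, k=8):
    """Hypothetical stand-in for a learned policy: sample k candidate swaps,
    then pick one with probability proportional to exp(score)."""
    def proposer(perm, rng):
        cands = [tuple(rng.sample(range(len(perm)), 2)) for _ in range(k)]
        weights = [math.exp(score_fn(perm, i, j)) for i, j in cands]
        return rng.choices(cands, weights=weights, k=1)[0]
    return proposer

def anneal(coords, demands, capacity, proposer, steps=2000, t0=1.0, alpha=0.999, seed=0):
    """Simulated annealing with a pluggable move proposer (assumed geometric cooling)."""
    rng = random.Random(seed)
    perm = list(range(1, len(coords)))        # index 0 is the depot
    rng.shuffle(perm)
    cost = route_cost(perm, coords, demands, capacity)
    best_perm, best_cost = perm[:], cost
    temp = t0
    for _ in range(steps):
        i, j = proposer(perm, rng)
        perm[i], perm[j] = perm[j], perm[i]   # apply the swap
        new_cost = route_cost(perm, coords, demands, capacity)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best_perm, best_cost = perm[:], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]   # reject: undo the swap
        temp *= alpha
    return best_perm, best_cost

# Toy instance (made up for illustration): depot at the origin, six unit-demand customers.
coords = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2), (-1, 0), (-2, 0)]
demands = [0, 1, 1, 1, 1, 1, 1]

uniform_perm, uniform_cost = anneal(coords, demands, 2, uniform_proposer)

# Hand-written score favouring swaps between nearby positions; a trained model would go here.
guided = make_scored_proposer(lambda perm, i, j: -abs(i - j))
guided_perm, guided_cost = anneal(coords, demands, 2, guided)
```

Because the proposer is passed in as an argument, the annealing loop itself is unchanged when the selection rule changes; only the distribution over candidate moves differs.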
Published
2026-06-08
How to Cite
Andretti, J., Cabessa, J., & Strozecki, Y. (2026). Learning-Guided Simulated Annealing for the Capacitated Vehicle Routing Problem. Proceedings of the International Conference on Automated Planning and Scheduling, 36(1), 572–580. https://doi.org/10.1609/icaps.v36i1.42875