Learning Encodings for Constructive Neural Combinatorial Optimization Needs to Regret
DOI: https://doi.org/10.1609/aaai.v38i18.30069
Keywords: SO: Combinatorial Optimization, ML: Reinforcement Learning, SO: Learning to Search
Abstract
Deep-reinforcement-learning (DRL) based neural combinatorial optimization (NCO) methods have demonstrated efficiency without relying on the guidance of optimal solutions. As the most mainstream among them, learning constructive heuristic (LCH) methods achieve high-quality solutions through a rapid autoregressive solution construction process. However, these LCH-based methods are deficient in convergence, and a performance gap to the optimum remains. Intuitively, learning to regret some steps in the solution construction process benefits both training efficiency and network representations. This article proposes a novel regret-based mechanism for an advanced solution construction process. Our method can be applied as a plug-in to any existing LCH-based DRL-NCO method. Experimental results demonstrate the capability of our work to enhance the performance of various NCO models, and show that the proposed LCH-Regret outperforms previous modification methods on several typical combinatorial optimization problems. The code and Supplementary File are available at https://github.com/SunnyR7/LCH-Regret.
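To make the idea concrete, the following is a minimal illustrative sketch of an autoregressive construction loop augmented with a "regret" (undo) action, in the spirit of the abstract. It is not the paper's implementation: the function name `construct_with_regret`, the greedy selection rule, and the random regret trigger are all assumptions standing in for the learned policy.

```python
import random

def construct_with_regret(dist, regret_prob=0.2, seed=0):
    """Greedy tour construction over a distance matrix `dist`, with a
    hypothetical 'regret' action: at each step the agent may undo its
    previous selection instead of appending a new node. In LCH-Regret
    this choice would come from the learned policy, not a coin flip."""
    rng = random.Random(seed)
    n = len(dist)
    tour = [0]          # start the partial solution at node 0
    visited = {0}
    while len(tour) < n:
        # With some probability, regret (undo) the most recent step.
        if len(tour) > 1 and rng.random() < regret_prob:
            removed = tour.pop()
            visited.discard(removed)
            continue
        # Otherwise construct: greedily append the nearest unvisited node.
        last = tour[-1]
        nxt = min((j for j in range(n) if j not in visited),
                  key=lambda j: dist[last][j])
        tour.append(nxt)
        visited.add(nxt)
    return tour
```

The sketch only shows the control flow a regret action adds to an otherwise purely constructive loop; any LCH-style decoder could expose such an extra action in its output distribution.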
Published
2024-03-24
How to Cite
Sun, R., Zheng, Z., & Wang, Z. (2024). Learning Encodings for Constructive Neural Combinatorial Optimization Needs to Regret. Proceedings of the AAAI Conference on Artificial Intelligence, 38(18), 20803-20811. https://doi.org/10.1609/aaai.v38i18.30069
Section: AAAI Technical Track on Search and Optimization