Learning Encodings for Constructive Neural Combinatorial Optimization Needs to Regret

Authors

  • Rui Sun Southern University of Science and Technology
  • Zhi Zheng Southern University of Science and Technology
  • Zhenkun Wang Southern University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v38i18.30069

Keywords:

SO: Combinatorial Optimization, ML: Reinforcement Learning, SO: Learning to Search

Abstract

Deep reinforcement learning (DRL)-based neural combinatorial optimization (NCO) methods have demonstrated efficiency without relying on the guidance of optimal solutions. As the most mainstream among them, learning constructive heuristic (LCH) methods achieve high-quality solutions through a rapid autoregressive solution construction process. However, these LCH-based methods are deficient in convergence, and there is still a performance gap compared to optimal solutions. Intuitively, learning to regret some steps in the solution construction process benefits both training efficiency and network representations. This article proposes a novel regret-based mechanism for an advanced solution construction process. Our method can be applied as a plug-in to any existing LCH-based DRL-NCO method. Experimental results demonstrate the capability of our work to enhance the performance of various NCO models. Results also show that the proposed LCH-Regret outperforms previous modification methods on several typical combinatorial optimization problems. The code and Supplementary File are available at https://github.com/SunnyR7/LCH-Regret.
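The abstract does not detail the mechanism, but the general idea of "regretting" a construction step can be illustrated with a plain greedy TSP constructor that may undo its most recent choice. The sketch below is an illustrative assumption, not the paper's actual learned LCH-Regret design: here the "regret" is a hand-coded check that swaps the last two cities when that shortens the partial tour, whereas the paper learns when to regret.

```python
import math

def tour_length(tour, dist):
    """Total length of a closed tour under the distance matrix `dist`."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def greedy_construct(dist, allow_regret=False):
    """Nearest-neighbor construction with an optional hand-coded 'regret' step.

    Illustrative only: LCH-Regret learns a regret policy; here the regret
    rule is a fixed local check on the last two appended cities.
    """
    n = len(dist)
    tour, visited = [0], {0}
    while len(tour) < n:
        # Constructive step: append the nearest unvisited city.
        last = tour[-1]
        nxt = min((c for c in range(n) if c not in visited),
                  key=lambda c: dist[last][c])
        tour.append(nxt)
        visited.add(nxt)
        # Regret step: undo the previous choice if visiting the new city
        # before the previous one shortens the partial tour.
        if allow_regret and len(tour) >= 3:
            a, b, c = tour[-3], tour[-2], tour[-1]
            if dist[a][c] + dist[c][b] < dist[a][b] + dist[b][c]:
                tour[-2], tour[-1] = tour[-1], tour[-2]
    return tour

if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 1), (0, 2)]
    dist = [[math.dist(p, q) for q in pts] for p in pts]
    print(greedy_construct(dist, allow_regret=True))
```

The point of the sketch is the control flow: construction is still autoregressive, but each step may be revisited once before the next city is committed, which is the kind of flexibility the regret mechanism introduces into the rollout.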

Published

2024-03-24

How to Cite

Sun, R., Zheng, Z., & Wang, Z. (2024). Learning Encodings for Constructive Neural Combinatorial Optimization Needs to Regret. Proceedings of the AAAI Conference on Artificial Intelligence, 38(18), 20803-20811. https://doi.org/10.1609/aaai.v38i18.30069

Section

AAAI Technical Track on Search and Optimization