Planning in Branch-and-Bound: Model-Based Reinforcement Learning for Exact Combinatorial Optimization

Authors

  • Paul Strang (EDF R&D, France; CNAM, France)
  • Zacharie Alès (ENSTA IP Paris, France; CNAM, France)
  • Côme Bissuel (EDF R&D, France)
  • Olivier Juan (EDF R&D, France)
  • Safia Kedad-Sidhoum (CNAM, France)
  • Emmanuel Rachelson (ISAE-SUPAERO, France)

DOI:

https://doi.org/10.1609/aaai.v40i30.39759

Abstract

Mixed-Integer Linear Programming (MILP) lies at the core of many real-world combinatorial optimization (CO) problems, traditionally solved by branch-and-bound (B&B). A key driver of B&B solver efficiency is the variable selection heuristic that guides branching decisions. Looking to move beyond static, hand-crafted heuristics, recent work has explored adapting traditional reinforcement learning (RL) algorithms to the B&B setting, aiming to learn branching strategies tailored to specific MILP distributions. In parallel, RL agents have achieved remarkable success in board games, a very specific type of combinatorial problem, by leveraging environment simulators to plan via Monte Carlo Tree Search (MCTS). Building on these developments, we introduce Plan-and-Branch-and-Bound (PlanB&B), a model-based reinforcement learning (MBRL) agent that leverages a learned internal model of the B&B dynamics to discover improved branching strategies. Computational experiments validate our approach, with our MBRL branching agent outperforming previous state-of-the-art RL methods across four standard MILP benchmarks.

Published

2026-03-14

How to Cite

Strang, P., Alès, Z., Bissuel, C., Juan, O., Kedad-Sidhoum, S., & Rachelson, E. (2026). Planning in Branch-and-Bound: Model-Based Reinforcement Learning for Exact Combinatorial Optimization. Proceedings of the AAAI Conference on Artificial Intelligence, 40(30), 25627–25635. https://doi.org/10.1609/aaai.v40i30.39759

Section

AAAI Technical Track on Machine Learning VII