Hierarchical Reinforcement Learning with Topology-Aware Exploration Framework for Multi-path Commodity Flow Problem

Authors

  • Jingchen Jiang, Beijing Institute of Technology
  • Xuan Zhou, Beijing Institute of Technology
  • Jiayuan Li, Beijing Institute of Technology; Zhongguancun Academy
  • Geng Han, Beijing Institute of Technology
  • Xiang Shi, Tsinghua University
  • Fang Deng, Beijing Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v40i43.40947

Abstract

The multi-path commodity flow problem (MPCFP) is crucial for ensuring reliable and high-speed data transmission in communication networks. However, existing studies that employ pre-generated routing paths neglect the real-time load state and the coupling among decisions, which prevents them from reaching high-quality solutions. To overcome this, we propose Hierarchical Reinforcement Learning with Topology-Aware Exploration (HRL-TAE), the first fully end-to-end framework that dynamically produces high-quality solutions based on real-time network states. HRL-TAE integrates an exploration mechanism and uses the State Transition Guiding List (STGL) to guide state transitions, thereby transforming topology exploration into a Markov decision process. Guided by STGL, two closely coupled layers in HRL-TAE, namely the path construct layer and the ratio allocate layer, construct multiple subpaths for each flow and allocate traffic ratios among them. Adaptive constraint-driven masks exclude infeasible actions during decision making, thereby guaranteeing that all constraints are satisfied. We also adopt a tailored training approach to obtain accurate gradient estimates and improve training efficiency. Simulations and real-world experiments demonstrate that HRL-TAE achieves superior performance.
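
The abstract does not specify how the constraint-driven masks are computed, so the following is only a generic sketch of action masking in a policy, not the authors' implementation. It assumes a hypothetical feasibility rule (a next hop is allowed if it has not been visited and its link has enough residual capacity for the flow's demand); the function names, the rule, and the toy numbers are all illustrative assumptions.

```python
import math


def feasible_next_hops(current, neighbors, visited, residual, demand):
    """Hypothetical mask: True where a next hop is unvisited and the
    link (current, hop) has at least `demand` residual capacity."""
    return [
        (n not in visited) and residual[(current, n)] >= demand
        for n in neighbors
    ]


def masked_softmax(logits, feasible):
    """Softmax over logits with infeasible actions forced to probability 0."""
    if not any(feasible):
        raise ValueError("no feasible action")
    # Subtract the largest feasible logit for numerical stability.
    m = max(l for l, ok in zip(logits, feasible) if ok)
    exps = [math.exp(l - m) if ok else 0.0 for l, ok in zip(logits, feasible)]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example (all values are made up for illustration).
neighbors = ["b", "c", "d"]
visited = {"a", "c"}  # "c" is already on the partial path
residual = {("a", "b"): 5.0, ("a", "c"): 9.0, ("a", "d"): 3.0}

mask = feasible_next_hops("a", neighbors, visited, residual, demand=2.0)
probs = masked_softmax([0.0, 5.0, 0.0], mask)
```

Here the hop to "c" has the highest logit but is masked out, so it receives zero probability and the remaining mass is split between "b" and "d". Masking before normalization, rather than penalizing violations in the reward, is what lets a policy of this kind never sample an infeasible action.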

Published

2026-03-14

How to Cite

Jiang, J., Zhou, X., Li, J., Han, G., Shi, X., & Deng, F. (2026). Hierarchical Reinforcement Learning with Topology-Aware Exploration Framework for Multi-path Commodity Flow Problem. Proceedings of the AAAI Conference on Artificial Intelligence, 40(43), 36280–36288. https://doi.org/10.1609/aaai.v40i43.40947

Section

AAAI Technical Track on Planning, Routing, and Scheduling