Deep Reinforcement Learning for Scalable Offline Three-Dimensional Packing
DOI:
https://doi.org/10.1609/aaai.v40i33.40009

Abstract
With the increasing number of items that must be handled simultaneously in complex logistics, offline three-dimensional packing methods need to plan ever larger numbers of items. Existing deep reinforcement learning (DRL)-based packing methods cannot plan large numbers of items while maintaining high-quality solutions, owing to their limited exploration space and high computational complexity. To address this issue, this paper proposes a scalable DRL-based packing method. An attention-based pack-Q-network (PQNet) is constructed to learn the optimal packing policy by integrating unpacked items, available spaces, and packed items. To expand the valid exploration space, a bidding-based multi-policy (BBMP) framework composed of multiple PQNets is designed to efficiently explore more latent valid solutions, thus enhancing solution quality. To reduce computational complexity, a training-free dynamic candidate selection (DCS) framework is proposed to incorporate comprehensive item information during execution with minimal computational overhead, which helps in effectively planning large numbers of items. Experimental results show that across item counts of 20–1000, our method consistently outperforms the best-performing baseline at each tested scale by 3.2%–13.1% in space utilization.

Published
2026-03-14
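The bidding mechanism behind the BBMP framework described in the abstract can be sketched as follows. This is only an illustrative assumption of how "bidding" among multiple policies might work, not the authors' implementation: `pqnet_scores` is a hypothetical linear stand-in for a trained PQNet, and each policy's bid is the best Q-value it can offer for the current state.

```python
import numpy as np

rng = np.random.default_rng(0)

def pqnet_scores(weights, state):
    # Hypothetical stand-in for a trained PQNet: a linear scorer mapping
    # each candidate placement's state features to a Q-value.
    return state @ weights

def bbmp_select(policies, state):
    """Bidding-based multi-policy selection (sketch): every policy scores
    all candidate placements, bids its highest Q-value, and the action of
    the highest bidder is executed."""
    best_policy, best_action, best_bid = None, None, -np.inf
    for k, weights in enumerate(policies):
        q = pqnet_scores(weights, state)
        action = int(np.argmax(q))          # this policy's preferred placement
        if q[action] > best_bid:            # keep the highest bid seen so far
            best_policy, best_action, best_bid = k, action, float(q[action])
    return best_policy, best_action, best_bid

# Toy setup: 6 candidate placements with 4 state features each, 3 policies.
state = rng.normal(size=(6, 4))
policies = [rng.normal(size=4) for _ in range(3)]
k, a, bid = bbmp_select(policies, state)
```

In this sketch the winning bid is, by construction, the largest Q-value any policy assigns to any candidate placement, so the ensemble never does worse (by its own Q-estimates) than its best member.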
How to Cite
Yin, H., He, H., & Chen, F. (2026). Deep Reinforcement Learning for Scalable Offline Three-Dimensional Packing. Proceedings of the AAAI Conference on Artificial Intelligence, 40(33), 27861–27869. https://doi.org/10.1609/aaai.v40i33.40009
Issue
Section
AAAI Technical Track on Machine Learning X