Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement Learning with Dynamic Depth Routing

Authors

  • Jinmin He (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences)
  • Kai Li (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences)
  • Yifan Zang (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences)
  • Haobo Fu (Tencent AI Lab)
  • Qiang Fu (Tencent AI Lab)
  • Junliang Xing (Tsinghua University)
  • Jian Cheng (Institute of Automation, Chinese Academy of Sciences; School of Future Technology, University of Chinese Academy of Sciences; AiRiA)

DOI:

https://doi.org/10.1609/aaai.v38i11.29129

Keywords:

ML: Reinforcement Learning

Abstract

Multi-task reinforcement learning endeavors to accomplish a set of different tasks with a single policy. To enhance data efficiency by sharing parameters across multiple tasks, a common practice segments the network into distinct modules and trains a routing network to recombine these modules into task-specific policies. However, existing routing approaches employ a fixed number of modules for all tasks, neglecting that tasks of varying difficulty commonly require varying amounts of knowledge. This work presents a Dynamic Depth Routing (D2R) framework, which learns to strategically skip certain intermediate modules, thereby flexibly choosing a different number of modules for each task. Under this framework, we further introduce a ResRouting method to address the issue of disparate routing paths between behavior and target policies during off-policy training. In addition, we design an automatic route-balancing mechanism to encourage continued routing exploration for unmastered tasks without disturbing the routing of mastered ones. We conduct extensive experiments on various robotic manipulation tasks in the Meta-World benchmark, where D2R achieves state-of-the-art performance with significantly improved learning efficiency.
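
The idea of choosing a different number of modules per task can be illustrated with a small toy example. The sketch below is not the D2R implementation: the abstract does not specify how routes are parameterized or learned, so the binary `gates` matrices are set by hand, and every name here (`make_modules`, `active_modules`, `forward`) is hypothetical. ResRouting and route balancing are not modeled. It only shows how a per-task route over a shared module stack can skip intermediate modules.

```python
import numpy as np

def make_modules(num_modules, dim, seed=0):
    """Shared pool of modules; each is a random linear map (assumption for illustration)."""
    rng = np.random.default_rng(seed)
    return [rng.normal(scale=dim ** -0.5, size=(dim, dim)) for _ in range(num_modules)]

def active_modules(gates):
    """Return the nodes that lie on some path to the output node.

    Node 0 is the input, nodes 1..n-1 are modules, node n-1 is the output module.
    gates[i, j] == 1 means node i reads the output of an earlier node j (j < i).
    """
    n = gates.shape[0]
    active = {n - 1}
    for i in range(n - 1, 0, -1):
        if i in active:
            active.update(int(j) for j in np.flatnonzero(gates[i, :i]))
    return active

def forward(x, weights, gates):
    """Run only the modules on the task's route; return (output, modules used)."""
    n = gates.shape[0]
    used = active_modules(gates)
    outputs = {0: x}
    for i in range(1, n):
        if i not in used:
            continue  # this module is skipped for this task
        sources = np.flatnonzero(gates[i, :i])
        if sources.size == 0:
            raise ValueError(f"active module {i} has no input source")
        h = np.mean([outputs[int(j)] for j in sources], axis=0)
        outputs[i] = np.tanh(weights[i - 1] @ h)
    return outputs[n - 1], len(used) - 1  # exclude the input node from the count

dim = 8
weights = make_modules(num_modules=4, dim=dim)
x = np.ones(dim)

# Hand-set route for an "easy" task: input -> module 1 -> module 4.
gates_easy = np.zeros((5, 5), dtype=int)
gates_easy[1, 0] = 1
gates_easy[4, 1] = 1

# Hand-set route for a "hard" task: all four modules, plus a shortcut from 1 to 3.
gates_hard = np.zeros((5, 5), dtype=int)
gates_hard[1, 0] = 1
gates_hard[2, 1] = 1
gates_hard[3, 1] = 1
gates_hard[3, 2] = 1
gates_hard[4, 3] = 1

out_easy, n_easy = forward(x, weights, gates_easy)
out_hard, n_hard = forward(x, weights, gates_hard)
print(n_easy, n_hard)
```

In this toy setup the easy route evaluates two of the four shared modules and the hard route evaluates all four, while both tasks share the same module parameters. In a learned system, the gates would be produced by a routing network conditioned on the task rather than fixed by hand.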

Published

2024-03-24

How to Cite

He, J., Li, K., Zang, Y., Fu, H., Fu, Q., Xing, J., & Cheng, J. (2024). Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement Learning with Dynamic Depth Routing. Proceedings of the AAAI Conference on Artificial Intelligence, 38(11), 12376-12384. https://doi.org/10.1609/aaai.v38i11.29129

Section

AAAI Technical Track on Machine Learning II