A Dynamics and Task Decoupled Reinforcement Learning Architecture for High-Efficiency Dynamic Target Intercept

Authors

  • Dora D. Liu (DeepBlue Academy of Sciences; BirenTech Research)
  • Liang Hu (Tongji University; DeepBlue Academy of Sciences)
  • Qi Zhang (University of Technology Sydney; DeepBlue Academy of Sciences)
  • Tangwei Ye (DeepBlue Academy of Sciences)
  • Usman Naseem (University of Sydney)
  • Zhong Yuan Lai (DeepBlue Academy of Sciences)

DOI:

https://doi.org/10.1609/aaai.v37i10.26421

Keywords:

PRS: Applications, ROB: Motion and Path Planning, ML: Reinforcement Learning Algorithms, PRS: Optimization of Spatio-Temporal Systems, PRS: Other Foundations of Planning, Routing & Scheduling, PRS: Planning Under Uncertainty

Abstract

Owing to their flexibility and ease of control, unmanned aerial vehicles (UAVs) have been increasingly used in a variety of scenarios and applications in recent years. Training UAVs with reinforcement learning (RL) for a specific task is often expensive in terms of time and computation. However, most of the learning effort is known to go into fitting the low-level physical dynamics system rather than the high-level task itself. In this paper, we study the application of UAVs to the dynamic target intercept (DTI) task, where different UAV models are equipped with correspondingly distinct dynamics systems. To address the resulting inefficiency in learning, we propose a dynamics and task decoupled RL architecture. The RL module focuses on modeling the DTI task without involving physical dynamics, and the design of states, actions, and rewards is completely task-oriented. The dynamics control module adaptively converts actions from the RL module into dynamics signals to control different UAVs without retraining the RL module. We demonstrate the efficiency and efficacy of our approach in comparison and ablation experiments against state-of-the-art methods.
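
The decoupling idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name here (UAVModel, task_policy, dynamics_controller, simulate) is hypothetical, the "policy" is a hand-written pursuit rule standing in for a trained RL module, and the 2D point-mass dynamics with speed and acceleration limits are assumptions made only for illustration. The point is that the task-level function sees positions only, while a separate controller maps its output onto each UAV model's limits.

```python
import math
from dataclasses import dataclass


@dataclass
class UAVModel:
    """Hypothetical per-airframe limits (assumed, not from the paper)."""
    name: str
    max_speed: float  # m/s
    max_accel: float  # m/s^2


def task_policy(uav_pos, target_pos):
    """Stand-in for the RL module: returns a unit direction toward the target.

    It uses only task-level quantities (positions) and knows nothing
    about the dynamics of the UAV that will execute the action.
    """
    dx = target_pos[0] - uav_pos[0]
    dy = target_pos[1] - uav_pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)
    return (dx / dist, dy / dist)


def dynamics_controller(model, velocity, direction, dt):
    """Convert a task-level direction into a model-specific velocity update.

    The desired velocity is the direction scaled by the model's top speed;
    the acceleration needed to reach it is clamped to the model's limit.
    """
    desired = (direction[0] * model.max_speed, direction[1] * model.max_speed)
    ax = (desired[0] - velocity[0]) / dt
    ay = (desired[1] - velocity[1]) / dt
    mag = math.hypot(ax, ay)
    if mag > model.max_accel:
        scale = model.max_accel / mag
        ax, ay = ax * scale, ay * scale
    return (velocity[0] + ax * dt, velocity[1] + ay * dt)


def simulate(model, steps=600, dt=0.1, capture_radius=2.0):
    """Run one intercept episode; return the step of capture, or None."""
    pos, vel = (0.0, 0.0), (0.0, 0.0)
    target, target_vel = (50.0, 20.0), (1.0, 0.0)  # assumed scenario
    for step in range(steps):
        if math.hypot(target[0] - pos[0], target[1] - pos[1]) < capture_radius:
            return step
        direction = task_policy(pos, target)  # task level
        vel = dynamics_controller(model, vel, direction, dt)  # dynamics level
        pos = (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)
        target = (target[0] + target_vel[0] * dt, target[1] + target_vel[1] * dt)
    return None


if __name__ == "__main__":
    for m in (UAVModel("fast", 10.0, 5.0), UAVModel("slow", 5.0, 2.0)):
        print(m.name, simulate(m))
```

In this sketch the same task_policy drives both hypothetical airframes, and only the UAVModel passed to dynamics_controller changes. This mirrors, in a very simplified way, the abstract's claim that different UAVs can be controlled without retraining the RL module.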

Published

2023-06-26

How to Cite

Liu, D. D., Hu, L., Zhang, Q., Ye, T., Naseem, U., & Lai, Z. Y. (2023). A Dynamics and Task Decoupled Reinforcement Learning Architecture for High-Efficiency Dynamic Target Intercept. Proceedings of the AAAI Conference on Artificial Intelligence, 37(10), 12049-12057. https://doi.org/10.1609/aaai.v37i10.26421

Section

AAAI Technical Track on Planning, Routing, and Scheduling