RLfOLD: Reinforcement Learning from Online Demonstrations in Urban Autonomous Driving

Authors

  • Daniel Coelho, Department of Mechanical Engineering, University of Aveiro, 3810-193 Aveiro, Portugal; Intelligent System Associate Laboratory (LASI), Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, 3810-193 Aveiro, Portugal
  • Miguel Oliveira, Department of Mechanical Engineering, University of Aveiro, 3810-193 Aveiro, Portugal; Intelligent System Associate Laboratory (LASI), Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, 3810-193 Aveiro, Portugal
  • Vitor Santos, Department of Mechanical Engineering, University of Aveiro, 3810-193 Aveiro, Portugal; Intelligent System Associate Laboratory (LASI), Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, 3810-193 Aveiro, Portugal

DOI:

https://doi.org/10.1609/aaai.v38i10.29049

Keywords:

ML: Reinforcement Learning, ML: Applications, ML: Deep Learning Algorithms

Abstract

Reinforcement Learning from Demonstrations (RLfD) has emerged as an effective method by fusing expert demonstrations into Reinforcement Learning (RL) training, harnessing the strengths of both Imitation Learning (IL) and RL. However, existing algorithms rely on offline demonstrations, which can introduce a distribution gap between the demonstrations and the actual training environment, limiting their performance. In this paper, we propose a novel approach, Reinforcement Learning from Online Demonstrations (RLfOLD), that leverages online demonstrations to address this limitation, ensuring the agent learns from relevant and up-to-date scenarios, thus effectively bridging the distribution gap. Unlike conventional policy networks used in typical actor-critic algorithms, RLfOLD introduces a policy network that outputs two standard deviations: one for exploration and the other for IL training. This novel design allows the agent to adapt to varying levels of uncertainty inherent in both RL and IL. Furthermore, we introduce an exploration process guided by an online expert, incorporating an uncertainty-based technique. Our experiments on the CARLA NoCrash benchmark demonstrate the effectiveness and efficiency of RLfOLD. Notably, even with a significantly smaller encoder and a single camera setup, RLfOLD surpasses state-of-the-art methods in this evaluation. These results, achieved with limited resources, highlight RLfOLD as a highly promising solution for real-world applications.
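
The two-standard-deviation policy output described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only states that the policy outputs one standard deviation for exploration and one for IL training. Everything else here is an assumption made for illustration, including the scalar action, the Gaussian form, the shared mean, the softplus parameterisation, the use of a negative log-likelihood as the imitation loss, and all names (`TwoStdPolicyHead`, `sample_exploration_action`, `imitation_nll`).

```python
import math
import random
from dataclasses import dataclass


@dataclass
class TwoStdPolicyOutput:
    mean: float         # shared action mean (assumption: one mean for both uses)
    std_explore: float  # spread used when sampling actions during RL exploration
    std_imitate: float  # spread used when scoring expert actions for IL


def linear(weights, bias, features):
    """Plain dot product plus bias, standing in for a network layer."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def softplus(x):
    """Numerically stable log(1 + exp(x)); keeps standard deviations positive."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


class TwoStdPolicyHead:
    """Hypothetical linear policy head with one mean and two std outputs."""

    def __init__(self, n_features, seed=0):
        rng = random.Random(seed)

        def init():
            return [rng.uniform(-0.1, 0.1) for _ in range(n_features)]

        self.w_mean, self.w_explore, self.w_imitate = init(), init(), init()
        self.b_mean = self.b_explore = self.b_imitate = 0.0

    def forward(self, features):
        mean = math.tanh(linear(self.w_mean, self.b_mean, features))
        std_explore = softplus(linear(self.w_explore, self.b_explore, features)) + 1e-3
        std_imitate = softplus(linear(self.w_imitate, self.b_imitate, features)) + 1e-3
        return TwoStdPolicyOutput(mean, std_explore, std_imitate)


def sample_exploration_action(out, rng):
    """Draw an action from N(mean, std_explore^2) for the RL rollout."""
    return rng.gauss(out.mean, out.std_explore)


def imitation_nll(out, expert_action):
    """Gaussian negative log-likelihood of the expert action under
    N(mean, std_imitate^2). Using NLL as the IL loss is an assumption."""
    var = out.std_imitate ** 2
    return 0.5 * math.log(2.0 * math.pi * var) + (expert_action - out.mean) ** 2 / (2.0 * var)


if __name__ == "__main__":
    head = TwoStdPolicyHead(n_features=3)
    out = head.forward([0.5, -1.0, 2.0])  # made-up feature vector
    action = sample_exploration_action(out, random.Random(1))
    print(out, action, imitation_nll(out, expert_action=0.2))
```

Keeping the two spreads separate means the noise injected for exploration does not have to match the uncertainty the policy assigns to expert actions, which is one plausible reading of the abstract's remark about adapting to the different uncertainty levels in RL and IL.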

Published

2024-03-24

How to Cite

Coelho, D., Oliveira, M., & Santos, V. (2024). RLfOLD: Reinforcement Learning from Online Demonstrations in Urban Autonomous Driving. Proceedings of the AAAI Conference on Artificial Intelligence, 38(10), 11660-11668. https://doi.org/10.1609/aaai.v38i10.29049

Section

AAAI Technical Track on Machine Learning I