SCORE: Skill-Conditioned Online Reinforcement Learning

Authors

  • Sara Karimi, KTH Royal Institute of Technology; King.com Ltd.
  • Sahar Asadi, King.com Ltd.
  • Amir H. Payberah, KTH Royal Institute of Technology

DOI:

https://doi.org/10.1609/aiide.v20i1.31879

Abstract

Solving complex long-horizon tasks through Reinforcement Learning (RL) from scratch presents challenges related to efficient exploration. Two common approaches to reduce complexity and enhance exploration efficiency are (i) integrating learning-from-demonstration techniques with online RL, where the prior knowledge acquired from demonstrations is used to guide exploration, refine representations, or tailor reward functions, and (ii) using representation learning to facilitate state abstraction. In this study, we present Skill-Conditioned Online REinforcement Learning (SCORE), a novel approach that leverages these two strategies and utilizes skills acquired from an unstructured demonstrations dataset in a policy gradient RL algorithm. This integration enriches the algorithm with informative input representations, improving downstream task learning and exploration efficiency. We evaluate our method on long-horizon robotic and navigation tasks and game environments, demonstrating enhancements in online RL performance compared to the baselines. Furthermore, we show our approach’s generalization capabilities and analyze its effectiveness through an ablation study.

Published

2024-11-15

How to Cite

Karimi, S., Asadi, S., & Payberah, A. H. (2024). SCORE: Skill-Conditioned Online Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 20(1), 189-198. https://doi.org/10.1609/aiide.v20i1.31879