Learn Goal-Conditioned Policy with Intrinsic Motivation for Deep Reinforcement Learning

Authors

  • Jinxin Liu (Zhejiang University; Westlake University; Westlake Institute for Advanced Study)
  • Donglin Wang (Westlake University; Westlake Institute for Advanced Study)
  • Qiangxing Tian (Zhejiang University; Westlake University; Westlake Institute for Advanced Study)
  • Zhengyu Chen (Zhejiang University; Westlake University; Westlake Institute for Advanced Study)

DOI:

https://doi.org/10.1609/aaai.v36i7.20721

Keywords:

Machine Learning (ML)

Abstract

It is important for an agent to autonomously explore its environment and learn a widely applicable, general-purpose goal-conditioned policy that can achieve diverse goals, including images and text descriptions. For such perceptually specific goals, one natural approach is to reward the agent with a prior non-parametric distance over the embedding spaces of states and goals. However, this may be infeasible in some situations, either because it is unclear how to choose a suitable measurement, or because embedding (heterogeneous) goals and states is non-trivial. The key insight of this work is to introduce a latent-conditioned policy that provides goals and intrinsic rewards for learning the goal-conditioned policy. Rather than directly scoring current states with regard to goals, we obtain rewards by scoring current states against their associated latent variables. We theoretically characterize the connection between our unsupervised objective and the multi-goal setting, and empirically demonstrate the effectiveness of the proposed method, which substantially outperforms prior techniques in a variety of tasks.
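
The idea of rewarding a state by scoring it against a latent variable, rather than against the goal itself, can be illustrated with a small sketch. The abstract does not specify the scoring function, so the form used here is an assumption: a discriminator-style reward log q(z | s) - log p(z) with a uniform prior over a discrete latent, as is common in unsupervised skill discovery. The names `softmax`, `intrinsic_reward`, and the logits are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def intrinsic_reward(discriminator_logits, z, num_latents):
    """Assumed reward: log q(z | s) - log p(z), with uniform p(z).

    discriminator_logits: hypothetical discriminator output for one state s,
        one logit per latent code.
    z: index of the latent code the policy was conditioned on.
    """
    q = softmax(discriminator_logits)
    return math.log(q[z]) - math.log(1.0 / num_latents)


# Hypothetical discriminator output for a single state, over 4 latent codes.
logits = [2.0, 0.1, -1.0, 0.3]

# The state looks like it came from latent 0, so z=0 is rewarded
# and z=2 is penalized.
r_match = intrinsic_reward(logits, z=0, num_latents=4)
r_mismatch = intrinsic_reward(logits, z=2, num_latents=4)
print(r_match > 0, r_mismatch < 0)
```

Under this assumed form, the reward is positive when the discriminator finds the state more consistent with the conditioning latent than chance, and negative otherwise; no distance between states and goals is needed.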

Published

2022-06-28

How to Cite

Liu, J., Wang, D., Tian, Q., & Chen, Z. (2022). Learn Goal-Conditioned Policy with Intrinsic Motivation for Deep Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 36(7), 7558-7566. https://doi.org/10.1609/aaai.v36i7.20721

Section

AAAI Technical Track on Machine Learning II