Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning

Authors

  • Haotian Fu College of Intelligence and Computing, Tianjin University
  • Hongyao Tang College of Intelligence and Computing, Tianjin University
  • Jianye Hao College of Intelligence and Computing, Tianjin University; Noah’s Ark Lab, Huawei
  • Chen Chen Noah’s Ark Lab, Huawei
  • Xidong Feng Department of Automation, Tsinghua University
  • Dong Li Noah’s Ark Lab, Huawei
  • Wulong Liu Noah’s Ark Lab, Huawei

Keywords

Reinforcement Learning

Abstract

Context, the embedding of previously collected trajectories, is a powerful construct for Meta-Reinforcement Learning (Meta-RL) algorithms. By conditioning on an effective context, Meta-RL policies can easily generalize to new tasks within a few adaptation steps. We argue that improving the quality of context involves answering two questions: 1. How to train a compact and sufficient encoder that can embed the task-specific information contained in prior trajectories? 2. How to collect informative trajectories whose corresponding context reflects the specification of the task? To this end, we propose a novel Meta-RL framework called CCM (Contrastive learning augmented Context-based Meta-RL). We first focus on the contrastive nature of different tasks and leverage it to train a compact and sufficient context encoder. Further, we train a separate exploration policy and theoretically derive a new information-gain-based objective that aims to collect informative trajectories in a few steps. Empirically, we evaluate our approach on common benchmarks as well as several complex sparse-reward environments. The experimental results show that CCM outperforms state-of-the-art algorithms by addressing each of the aforementioned problems.
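The contrastive training of the context encoder can be illustrated with an InfoNCE-style loss, in which trajectory embeddings from the same task are pulled together and embeddings from different tasks act as negatives. The sketch below is a minimal, hypothetical illustration of that general idea in NumPy (the function name, embedding shapes, and temperature value are assumptions, not the paper's exact formulation):

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss: each anchor embedding should be
    most similar to the positive from the same task (same row index),
    while embeddings of other tasks serve as negatives.

    This is an illustrative sketch, not the CCM objective itself."""
    # Normalize so the dot product is cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (n_tasks, n_tasks)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive pair for anchor i is positives[i] (the diagonal).
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
# Toy context embeddings: two trajectory encodings per task, 3 tasks, dim 8.
task_centers = rng.normal(size=(3, 8))
anchors = task_centers + 0.01 * rng.normal(size=(3, 8))
positives = task_centers + 0.01 * rng.normal(size=(3, 8))
loss = info_nce_loss(anchors, positives)
```

With well-matched pairs the loss is close to zero; pairing anchors with embeddings from the wrong tasks (e.g. by rotating the rows of `positives`) yields a larger loss, which is what drives the encoder toward task-discriminative contexts.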

Published

2021-05-18

How to Cite

Fu, H., Tang, H., Hao, J., Chen, C., Feng, X., Li, D., & Liu, W. (2021). Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 35(8), 7457-7465. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/16914

Section

AAAI Technical Track on Machine Learning I