Adapt to Environment Sudden Changes by Learning a Context Sensitive Policy

Authors

  • Fan-Ming Luo, Nanjing University
  • Shengyi Jiang, Nanjing University
  • Yang Yu, Nanjing University; Polixir Technologies
  • ZongZhang Zhang, Nanjing University; Alibaba Group
  • Yi-Feng Zhang, Nanjing University

DOI:

https://doi.org/10.1609/aaai.v36i7.20730

Keywords:

Machine Learning (ML)

Abstract

When dealing with real-world reinforcement learning (RL) tasks, we should be aware that the environment may change suddenly. A robust policy is expected to handle such changes and adapt to the new environment rapidly. Context-based meta reinforcement learning aims to learn environment-adaptable policies. These methods adopt a context encoder to perceive the environment on the fly, after which a contextual policy makes environment-adaptive decisions according to the context. However, previous methods show lagged and unstable context extraction, which makes it hard for them to handle sudden changes well. This paper proposes an environment sensitive contextual policy learning (ESCP) approach to improve both the sensitivity and the robustness of context encoding. ESCP is composed of three key components: variance minimization, which forces a rapid and stable encoding of the environment context; relational matrix determinant maximization, which avoids trivial solutions; and a history-truncated recurrent neural network model, which avoids interference from old memory. We use a grid-world task and 5 locomotion control tasks with changing parameters to empirically assess our algorithm. Experimental results show that, in environments with both in-distribution and out-of-distribution parameter changes, ESCP not only recovers the environment encoding better but also adapts more rapidly to the post-change environment (10x faster in the grid-world) while maintaining or improving the return, compared with state-of-the-art meta RL methods.
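
To illustrate the idea behind a history-truncated recurrent encoder, the following is a minimal toy sketch and not the authors' implementation. It assumes a scalar tanh RNN with fixed, hand-picked weights; the names `rnn_encode`, `truncated_context`, the window length `k`, and the weights `w_in` and `w_rec` are all hypothetical and chosen only for illustration. The sketch shows that when the encoder restarts from a zero hidden state and reads only the `k` most recent observations, data gathered before a sudden change cannot influence the context.

```python
import math
import random


def rnn_encode(inputs, w_in=0.8, w_rec=0.5):
    """Run a toy scalar tanh RNN from a zero hidden state; return the final state."""
    h = 0.0
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
    return h


def truncated_context(history, k, w_in=0.8, w_rec=0.5):
    """Encode only the k most recent entries, so older steps cannot affect the result."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return rnn_encode(history[-k:], w_in, w_rec)


random.seed(0)
k = 4  # hypothetical truncation length
old_a = [random.gauss(0.0, 1.0) for _ in range(10)]  # pre-change data, run A
old_b = [random.gauss(0.0, 1.0) for _ in range(10)]  # pre-change data, run B
recent = [random.gauss(0.0, 1.0) for _ in range(k)]  # shared post-change data

ctx_a = truncated_context(old_a + recent, k)
ctx_b = truncated_context(old_b + recent, k)

# Both calls feed the same k observations to the encoder, so the contexts match
# even though the two runs saw different data before the change.
print(ctx_a == ctx_b)
```

An untruncated encoder, by contrast, carries its hidden state across the change, so pre-change observations can still affect the context; in this sketch that corresponds to calling `rnn_encode` on the full history.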

Published

2022-06-28

How to Cite

Luo, F.-M., Jiang, S., Yu, Y., Zhang, Z., & Zhang, Y.-F. (2022). Adapt to Environment Sudden Changes by Learning a Context Sensitive Policy. Proceedings of the AAAI Conference on Artificial Intelligence, 36(7), 7637-7646. https://doi.org/10.1609/aaai.v36i7.20730

Section

AAAI Technical Track on Machine Learning II