Hypernetworks for Zero-Shot Transfer in Reinforcement Learning

Sahand Rezaei-Shoshtari; Charlotte Morissette; Francois R. Hogan; Gregory Dudek; David Meger

doi:10.1609/aaai.v37i8.26146

Authors

Sahand Rezaei-Shoshtari McGill University Mila - Quebec AI Institute Samsung AI Center Montreal
Charlotte Morissette McGill University Samsung AI Center Montreal
Francois R. Hogan Samsung AI Center Montreal
Gregory Dudek McGill University Samsung AI Center Montreal Mila - Quebec AI Institute
David Meger McGill University Samsung AI Center Montreal Mila - Quebec AI Institute

DOI:

https://doi.org/10.1609/aaai.v37i8.26146

Keywords:

ML: Reinforcement Learning Algorithms, ML: Transfer, Domain Adaptation, Multi-Task Learning

Abstract

In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach is based upon viewing each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy and seek to approximate it with a hypernetwork that can generate near-optimal value functions and policies, given the parameters of the MDP. We show that, under certain conditions, this mapping can be considered as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multitask and meta RL approaches.

Hypernetworks for Zero-Shot Transfer in Reinforcement Learning

Authors

DOI:

Keywords:

Abstract

Downloads

Published

How to Cite

Issue

Section

Information

Subscription