Hypernetworks for Zero-Shot Transfer in Reinforcement Learning

Authors

  • Sahand Rezaei-Shoshtari (McGill University; Mila - Quebec AI Institute; Samsung AI Center Montreal)
  • Charlotte Morissette (McGill University; Samsung AI Center Montreal)
  • Francois R. Hogan (Samsung AI Center Montreal)
  • Gregory Dudek (McGill University; Samsung AI Center Montreal; Mila - Quebec AI Institute)
  • David Meger (McGill University; Samsung AI Center Montreal; Mila - Quebec AI Institute)

DOI:

https://doi.org/10.1609/aaai.v37i8.26146

Keywords:

ML: Reinforcement Learning Algorithms; ML: Transfer, Domain Adaptation, Multi-Task Learning

Abstract

In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach views each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy, and we seek to approximate this mapping with a hypernetwork that can generate near-optimal value functions and policies, given the parameters of the MDP. We show that, under certain conditions, learning this mapping can be treated as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multitask and meta RL approaches.
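
The core idea of a hypernetwork that maps MDP parameters to a policy can be illustrated with a minimal sketch. This is not the authors' architecture and it omits the TD-based training objective entirely; it only shows the forward mapping from a context vector to the weights of a small policy. All names (`init_hypernet`, `generate_policy`), the layer sizes, and the linear-tanh policy form are assumptions made for illustration.

```python
import math
import random


def init_hypernet(context_dim, state_dim, action_dim, hidden=16, seed=0):
    """Random weights for a one-hidden-layer hypernetwork (illustrative sizes)."""
    rng = random.Random(seed)
    # The hypernetwork outputs every parameter of the policy: weights plus biases.
    n_out = action_dim * state_dim + action_dim

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {"W1": mat(hidden, context_dim), "W2": mat(n_out, hidden),
            "shape": (action_dim, state_dim)}


def matvec(M, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def generate_policy(hypernet, context):
    """Map MDP parameters (the context) to a linear-tanh policy (assumed form)."""
    h = [math.tanh(z) for z in matvec(hypernet["W1"], context)]
    flat = matvec(hypernet["W2"], h)
    action_dim, state_dim = hypernet["shape"]
    # Unpack the flat output into the policy's weight matrix and bias vector.
    W = [flat[i * state_dim:(i + 1) * state_dim] for i in range(action_dim)]
    b = flat[action_dim * state_dim:]

    def policy(state):
        return [math.tanh(z + bi) for z, bi in zip(matvec(W, state), b)]

    return policy


# Two hypothetical task contexts (e.g. reward and dynamics parameters).
hypernet = init_hypernet(context_dim=2, state_dim=3, action_dim=1)
policy_a = generate_policy(hypernet, [0.5, 1.0])
policy_b = generate_policy(hypernet, [2.0, -1.0])

state = [0.1, -0.2, 0.3]
action_a = policy_a(state)
action_b = policy_b(state)
```

With untrained weights the generated policies are arbitrary; the point is only that a single set of hypernetwork weights yields a different policy for each context without any per-task adaptation, which is what makes zero-shot use at test time possible once the hypernetwork has been trained.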

Published

2023-06-26

How to Cite

Rezaei-Shoshtari, S., Morissette, C., Hogan, F. R., Dudek, G., & Meger, D. (2023). Hypernetworks for Zero-Shot Transfer in Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 37(8), 9579-9587. https://doi.org/10.1609/aaai.v37i8.26146

Section

AAAI Technical Track on Machine Learning III