Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes
DOI:
https://doi.org/10.1609/aaai.v31i1.11065
Keywords:
Reinforcement Learning, Transfer Learning, Latent Variable Models, Gaussian Process Dynamical Model
Abstract
An intriguing application of transfer learning emerges when tasks arise with similar, but not identical, dynamics. Hidden Parameter Markov Decision Processes (HiP-MDP) embed these tasks into a low-dimensional space; given the embedding parameters one can identify the MDP for a particular task. However, the original formulation of HiP-MDP had a critical flaw: the embedding uncertainty was modeled independently of the agent's state uncertainty, requiring an arduous training procedure. In this work, we apply a Gaussian Process latent variable model to jointly model the dynamics and the embedding, leading to a more elegant formulation, one that allows for better uncertainty quantification and thus more robust transfer.
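To make the idea of a task family indexed by hidden parameters concrete, here is a minimal toy sketch. It is not the paper's method: the one-dimensional dynamics `s' = s + theta * a + noise`, the function names, and the least-squares point estimate are all assumptions chosen for illustration. The paper instead uses a Gaussian Process latent variable model, which also yields uncertainty over the embedding.

```python
import random

def step(state, action, theta, rng, noise_std=0.01):
    """One transition of a hypothetical task family: s' = s + theta * a + noise."""
    return state + theta * action + rng.gauss(0.0, noise_std)

def collect_transitions(theta, n_steps, rng):
    """Roll out random actions in the task whose hidden parameter is theta."""
    transitions = []
    state = 0.0
    for _ in range(n_steps):
        action = rng.uniform(-1.0, 1.0)
        next_state = step(state, action, theta, rng)
        transitions.append((state, action, next_state))
        state = next_state
    return transitions

def estimate_theta(transitions):
    """Least-squares point estimate of theta from (s' - s) = theta * a.

    Assumption: a point estimate stands in for the posterior over the
    embedding that a latent variable model would provide.
    """
    numerator = sum(a * (s_next - s) for s, a, s_next in transitions)
    denominator = sum(a * a for _, a, _ in transitions)
    return numerator / denominator

rng = random.Random(0)
true_theta = 1.7  # hidden from the agent; differs from task to task
data = collect_transitions(true_theta, n_steps=50, rng=rng)
theta_hat = estimate_theta(data)
print(f"estimated theta: {theta_hat:.3f}")
```

Once `theta_hat` is recovered from a handful of transitions, the dynamics of the new task are pinned down, which is the sense in which the embedding parameters identify the MDP. A point estimate like this one discards the uncertainty that the abstract argues should be modeled jointly with the dynamics.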