Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes

Authors

  • Guillermo Infante, Universitat Pompeu Fabra
  • Anders Jonsson, Universitat Pompeu Fabra
  • Vicenç Gómez, Universitat Pompeu Fabra

DOI:

https://doi.org/10.1609/aaai.v36i6.20655

Keywords:

Machine Learning (ML)

Abstract

We present a novel approach to hierarchical reinforcement learning for linearly-solvable Markov decision processes. Our approach assumes that the state space is partitioned, and defines subtasks for moving between the partitions. We represent value functions at several levels of abstraction, and use the compositionality of subtasks to estimate the optimal values of the states in each partition. The policy is implicitly defined on these optimal value estimates, rather than being decomposed among the subtasks. As a consequence, our approach can learn the globally optimal policy, and does not suffer from non-stationarities induced by high-level decisions. If several partitions have equivalent dynamics, the subtasks of those partitions can be shared. We show that our approach is significantly more sample-efficient than a flat learner and than similar hierarchical approaches when the set of boundary states is smaller than the entire state space.
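The compositionality mentioned in the abstract comes from the linearly-solvable setting: in a first-exit LMDP the desirability function z(s) = exp(-v(s)) satisfies a linear Bellman equation, so the desirability of an interior state of a partition can be written as a linear combination of subtask desirabilities, weighted by the (global) desirabilities of the partition's boundary states. The following Python sketch illustrates that idea on a small hypothetical partition; the matrices P_II and P_IB, the cost vector q_I, the boundary desirabilities, and the function names are illustrative assumptions, not the authors' implementation.

import numpy as np

def solve_subtask(P_II, P_IB, q_I, exit_idx):
    """Desirability z_i(s) for the subtask 'leave the partition at boundary state i'.

    First-exit linear Bellman equation for interior states:
        z(s) = exp(-q(s)) * sum_{s'} P(s'|s) z(s'),
    with terminal desirability 1 at the chosen exit state and 0 at the others.
    Solved here directly as the linear system (I - G P_II) z = G P_IB e_i,
    where G = diag(exp(-q_I)).
    """
    n_I, n_B = P_IB.shape
    G = np.diag(np.exp(-q_I))
    z_B = np.zeros(n_B)
    z_B[exit_idx] = 1.0                      # indicator terminal condition
    A = np.eye(n_I) - G @ P_II
    b = G @ (P_IB @ z_B)
    return np.linalg.solve(A, b)

def compose(subtask_z, boundary_z):
    """Compositionality: the desirability of an interior state is the
    boundary-weighted sum of subtask desirabilities, z(s) = sum_i z(b_i) * z_i(s)."""
    return sum(w * z for w, z in zip(boundary_z, subtask_z))

# Hypothetical partition: 3 interior states, 2 boundary (exit) states.
# Rows of [P_II | P_IB] are the passive dynamics and sum to 1.
P_II = np.array([[0.0, 0.5, 0.0],
                 [0.3, 0.0, 0.3],
                 [0.0, 0.5, 0.0]])
P_IB = np.array([[0.5, 0.0],
                 [0.2, 0.2],
                 [0.0, 0.5]])
q_I = np.array([1.0, 1.0, 1.0])              # unit cost per interior step

subtask_z = [solve_subtask(P_II, P_IB, q_I, i) for i in range(2)]
boundary_z = np.array([0.8, 0.1])            # assumed global desirabilities of the exits
z_interior = compose(subtask_z, boundary_z)  # interior values consistent with the global task
print(z_interior)

Because the boundary conditions enter the linear system linearly, the two subtask solutions only need to be computed once per set of equivalent partitions and can then be reweighted whenever the boundary values are updated, which is the source of the sample-efficiency gains described in the abstract.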

Published

2022-06-28

How to Cite

Infante, G., Jonsson, A., & Gómez, V. (2022). Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes. Proceedings of the AAAI Conference on Artificial Intelligence, 36(6), 6970-6977. https://doi.org/10.1609/aaai.v36i6.20655

Section

AAAI Technical Track on Machine Learning I