Progressive State Space Disaggregation for Infinite Horizon Dynamic Programming

Orso Forghieri; Hind Castel; Emmanuel Hyon; Erwan Le Pennec

doi:10.1609/icaps.v34i1.31479

Authors

Orso Forghieri CMAP, École Polytechnique, Institut Polytechnique de Paris
Hind Castel Télécom SudParis, Institut Polytechnique de Paris
Emmanuel Hyon Sorbonne Université Université Paris Nanterre
Erwan Le Pennec CMAP, École Polytechnique, Institut Polytechnique de Paris

DOI:

https://doi.org/10.1609/icaps.v34i1.31479

Abstract

High dimensionality of model-based Reinforcement Learning and Markov Decision Processes can be reduced using abstractions of the state and action spaces. Although hierarchical learning and state abstraction methods have been explored over the past decades, explicit methods to build useful abstractions of models are rarely provided. In this work, we provide a new state abstraction method for solving infinite horizon problems in the discounted and total settings. Our approach is to progressively disaggregate abstract regions by iteratively slicing aggregations of states relatively to a value function. The distinguishing feature of our method, in contrast to previous approximations of the Bellman operator, is the disaggregation of regions during value function iterations (or policy evaluation steps). The objective is to find a more efficient aggregation that reduces the error on each piece of the partition. We provide a proof of convergence for this algorithm without making any assumptions about the structure of the problem. We also show that this process decreases the computational complexity of the Bellman operator iteration and provides useful abstractions. We then plug this state space disaggregation process in classical Dynamic Programming algorithm namely Approximate Value Iteration, Q-Value Iteration and Policy Iteration. Finally, we conduct a numerical comparison on randomly generated MDPs as well as classical MDPs. Those experiments show that our policy-based algorithm is faster than both traditional dynamic programming approach and recent aggregative methods that use a fixed number of adaptive partitions.

Progressive State Space Disaggregation for Infinite Horizon Dynamic Programming

Authors

DOI:

Abstract

Downloads

Published

How to Cite

Issue

Section

Information