Learning Predictive State Representations From Non-Uniform Sampling

Yuri Grinberg; Hossein Aboutalebi; Melanie Lyman-Abramovitch; Borja Balle; Doina Precup

doi:10.1609/aaai.v32i1.11744

Authors

Yuri Grinberg, National Research Council of Canada
Hossein Aboutalebi, McGill University
Melanie Lyman-Abramovitch, McGill University
Borja Balle, Amazon Research Cambridge
Doina Precup, McGill University

DOI:

https://doi.org/10.1609/aaai.v32i1.11744

Keywords:

Sequential Data Modelling, Reinforcement Learning

Abstract

Predictive state representations (PSRs) have emerged as a powerful method for modelling partially observable environments. PSR learning algorithms can build models for predicting all observable variables, or for predicting only some of them conditioned on others (e.g., actions or exogenous variables). In the latter case, which we call conditional modelling, the accuracy of different estimates of the conditional probabilities for a fixed dataset can vary significantly, due to the limited sampling of certain conditions. This can have negative consequences for the PSR parameter estimation process, which current state-of-the-art PSR spectral learning algorithms do not take into account. In this paper, we closely examine conditional modelling within the PSR framework. We first establish a new positive but surprisingly non-trivial result: a conditional model can never be larger than the complete model. Then, we address the core shortcoming of existing PSR spectral learning methods for conditional models by incorporating an additional step in the process, which can be seen as a type of matrix denoising. We further refine this objective by adding penalty terms for violations of the system dynamics matrix structure, which improves the PSR predictive performance. Empirical evaluations on both synthetic and real datasets highlight the advantages of the proposed approach.
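The effect of limited sampling on conditional estimates can be illustrated with a minimal sketch. This is not the paper's algorithm; the function name `conditional_estimates`, the condition labels, and the counts are hypothetical, chosen only to show that two conditional probability estimates with the same value can differ greatly in accuracy when one condition is sampled far less often than the other.

```python
import math


def conditional_estimates(counts):
    """Estimate P(observation | condition) and its standard error.

    `counts` maps a condition label to a pair
    (times the condition was sampled, times the observation followed).
    All names and numbers here are illustrative assumptions.
    """
    result = {}
    for condition, (n_cond, n_obs) in counts.items():
        p_hat = n_obs / n_cond
        # Standard error of a binomial proportion estimate.
        std_err = math.sqrt(p_hat * (1.0 - p_hat) / n_cond)
        result[condition] = (p_hat, std_err)
    return result


# Hypothetical dataset: one condition is sampled 1000 times more often.
counts = {
    "frequent_action": (10000, 3000),
    "rare_action": (10, 3),
}

estimates = conditional_estimates(counts)
for condition, (p_hat, std_err) in estimates.items():
    print(f"{condition}: p_hat={p_hat:.3f}, std_err={std_err:.4f}")
```

Both conditions yield the same point estimate, but the estimate for the rarely sampled condition carries a much larger standard error. A learning procedure that weights all entries equally treats these two estimates as equally reliable, which is the kind of mismatch the abstract describes.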
