Efficient PAC-Optimal Exploration in Concurrent, Continuous State MDPs with Delayed Updates

Authors

  • Jason Pazis, Massachusetts Institute of Technology
  • Ronald Parr, Duke University

DOI:

https://doi.org/10.1609/aaai.v30i1.10307

Keywords:

MDP, exploration, PAC, optimal, concurrent, efficient, delayed

Abstract

We present a new, efficient PAC-optimal exploration algorithm that can explore multiple continuous- or discrete-state MDPs simultaneously. Our algorithm does not assume that value function updates can be completed instantaneously, and it maintains PAC guarantees in real-time environments. Not only does it extend the applicability of PAC-optimal exploration algorithms to new, realistic settings, but even when instantaneous value function updates are possible, our bounds improve significantly over previous single-MDP exploration bounds and drastically over previous concurrent PAC bounds. We also present TCE, a new, fine-grained metric for the cost of exploration.

Published

2016-03-02

How to Cite

Pazis, J., & Parr, R. (2016). Efficient PAC-Optimal Exploration in Concurrent, Continuous State MDPs with Delayed Updates. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1). https://doi.org/10.1609/aaai.v30i1.10307

Issue

Vol. 30 No. 1 (2016)

Section

Technical Papers: Machine Learning Methods