Sample-Based Planning for Continuous Action Markov Decision Processes

Authors

  • Chris Mansley, Rutgers University
  • Ari Weinstein, Rutgers University
  • Michael Littman, Rutgers University

DOI:

https://doi.org/10.1609/icaps.v21i1.13484

Abstract

In this paper, we present a new algorithm that integrates recent advances in solving continuous bandit problems with sample-based rollout methods for planning in Markov Decision Processes (MDPs). Our algorithm, Hierarchical Optimistic Optimization applied to Trees (HOOT), addresses planning in continuous-action MDPs. Empirical results show that our algorithm meets or exceeds the performance of a similar discrete-action planner while eliminating the need for manual discretization of the action space.
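
The abstract's key ingredient is the continuous-bandit strategy (HOO) that HOOT applies at each node of the rollout tree to choose actions from a continuous range. The sketch below is a minimal, hypothetical illustration of that strategy on a 1-D action space [0, 1] with a toy noisy reward; the smoothness parameters `nu` and `rho`, the midpoint action rule, and the reward function are illustrative assumptions, not the paper's implementation.

```python
import math
import random

class HOONode:
    """Node covering the action interval [lo, hi] at a given tree depth."""
    def __init__(self, lo, hi, depth):
        self.lo, self.hi, self.depth = lo, hi, depth
        self.n = 0                 # visit count
        self.mean = 0.0            # empirical mean reward
        self.left = self.right = None
        self.b = float('inf')      # optimistic B-value (unvisited => infinity)

class HOO:
    """Hierarchical Optimistic Optimization over actions in [0, 1].

    nu and rho are smoothness parameters; the values here are assumptions
    chosen for this toy sketch.
    """
    def __init__(self, nu=1.0, rho=0.5):
        self.root = HOONode(0.0, 1.0, 0)
        self.nu, self.rho = nu, rho
        self.t = 0                 # total number of pulls so far

    def select(self):
        # Descend to a leaf, always following the child with the larger B-value.
        path, node = [], self.root
        while node.left is not None:
            path.append(node)
            node = node.left if node.left.b >= node.right.b else node.right
        path.append(node)
        # Play a representative action in the leaf's interval (midpoint here).
        return 0.5 * (node.lo + node.hi), path

    def update(self, path, reward):
        self.t += 1
        # Expand the selected leaf by halving its interval.
        leaf = path[-1]
        mid = 0.5 * (leaf.lo + leaf.hi)
        leaf.left = HOONode(leaf.lo, mid, leaf.depth + 1)
        leaf.right = HOONode(mid, leaf.hi, leaf.depth + 1)
        # Update statistics along the path, then recompute B-values bottom-up.
        for node in path:
            node.n += 1
            node.mean += (reward - node.mean) / node.n
        for node in reversed(path):
            u = (node.mean
                 + math.sqrt(2 * math.log(self.t) / node.n)
                 + self.nu * self.rho ** node.depth)
            node.b = min(u, max(node.left.b, node.right.b))

# Toy usage: maximize a smooth noisy reward peaked at a = 0.7.
random.seed(0)
hoo = HOO()
for _ in range(500):
    a, path = hoo.select()
    r = 1.0 - (a - 0.7) ** 2 + random.gauss(0, 0.05)
    hoo.update(path, r)
best_action, _ = hoo.select()
```

In HOOT, one such bandit instance would sit at each state node of the sample-based planning tree, so action selection refines adaptively around high-value regions instead of relying on a fixed discretization.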

Published

2011-03-22

How to Cite

Mansley, C., Weinstein, A., & Littman, M. (2011). Sample-Based Planning for Continuous Action Markov Decision Processes. Proceedings of the International Conference on Automated Planning and Scheduling, 21(1), 335-338. https://doi.org/10.1609/icaps.v21i1.13484