Sample-Based Planning for Continuous Action Markov Decision Processes
DOI: https://doi.org/10.1609/icaps.v21i1.13484
Abstract
In this paper, we present a new algorithm that integrates recent advances in solving continuous bandit problems with sample-based rollout methods for planning in Markov Decision Processes (MDPs). Our algorithm, Hierarchical Optimistic Optimization applied to Trees (HOOT), addresses planning in continuous-action MDPs. Empirical results show that HOOT meets or exceeds the performance of a comparable discrete-action planner while eliminating the need to manually discretize the action space.
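The core building block the abstract refers to, Hierarchical Optimistic Optimization (HOO), maintains a tree of nested subintervals over a continuous action space and samples optimistically from the most promising cell. Below is a minimal, simplified sketch of that idea on a 1-D bandit (it omits the B-value backup of full HOO); all names, parameters, and the toy reward function are illustrative assumptions, not the authors' implementation.

```python
import math
import random

class Cell:
    """A node of the cover tree over the action interval [lo, hi].

    Illustrative sketch only: this is not the HOOT authors' code.
    """
    def __init__(self, lo, hi, depth):
        self.lo, self.hi, self.depth = lo, hi, depth
        self.n = 0        # number of pulls routed through this cell
        self.mean = 0.0   # running mean of observed rewards
        self.left = self.right = None

def u_value(cell, t, rho, nu):
    """Optimistic score: mean + UCB-style bonus + diameter term nu * rho^depth."""
    if cell.n == 0:
        return float("inf")
    bonus = math.sqrt(2.0 * math.log(t) / cell.n)
    return cell.mean + bonus + nu * cell.depth ** 0 * rho ** cell.depth

def pull(root, t, reward_fn, rho=0.5, nu=1.0):
    """Descend to a leaf by optimistic scores, sample an action from its
    interval, observe a reward, split the leaf, and update the path."""
    path, node = [root], root
    while node.left is not None:
        node = max((node.left, node.right),
                   key=lambda c: u_value(c, t, rho, nu))
        path.append(node)
    mid = 0.5 * (node.lo + node.hi)           # refine the chosen cell
    node.left = Cell(node.lo, mid, node.depth + 1)
    node.right = Cell(mid, node.hi, node.depth + 1)
    action = random.uniform(node.lo, node.hi)
    r = reward_fn(action)
    for cell in path:                          # incremental mean update
        cell.n += 1
        cell.mean += (r - cell.mean) / cell.n
    return action, r

# Demo on a toy deterministic reward peaked at x = 0.7 (an assumption
# for illustration); HOOT would run this selection at each tree node.
random.seed(0)
root = Cell(0.0, 1.0, 0)
best_action, best_reward = None, -float("inf")
for t in range(1, 501):
    a, r = pull(root, t, lambda x: 1.0 - abs(x - 0.7))
    if r > best_reward:
        best_action, best_reward = a, r
```

After a few hundred pulls the sampling concentrates near the maximizer, which is the behavior HOOT exploits when it replaces UCT's discrete argmax over actions with this continuous optimistic search.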
Published
2011-03-22
How to Cite
Mansley, C., Weinstein, A., & Littman, M. (2011). Sample-Based Planning for Continuous Action Markov Decision Processes. Proceedings of the International Conference on Automated Planning and Scheduling, 21(1), 335-338. https://doi.org/10.1609/icaps.v21i1.13484
Section
Short Papers