Planning with Abstract Markov Decision Processes

Authors

  • Nakul Gopalan, Brown University
  • Marie desJardins, University of Maryland
  • Michael Littman, Brown University
  • James MacGlashan, Cogitai Incorporated
  • Shawn Squire, University of Maryland
  • Stefanie Tellex, Brown University
  • John Winder, University of Maryland
  • Lawson Wong, Brown University

DOI

https://doi.org/10.1609/icaps.v27i1.13867

Abstract

Robots acting in human-scale environments must plan under uncertainty in large state–action spaces and face constantly changing reward functions as requirements and goals change. Planning under uncertainty in large state–action spaces requires hierarchical abstraction for efficient computation. We introduce a new hierarchical planning framework called Abstract Markov Decision Processes (AMDPs) that can plan in a fraction of the time needed for complex decision making in ordinary MDPs. AMDPs provide abstract states, actions, and transition dynamics in multiple layers above a base-level “flat” MDP. AMDPs decompose problems into a series of subtasks with both local reward and local transition functions used to create policies for subtasks. The resulting hierarchical planning method is independently optimal at each level of abstraction, and is recursively optimal when the local reward and transition functions are correct. We present empirical results showing significantly improved planning speed, while maintaining solution quality, in the Taxi domain and in a mobile-manipulation robotics problem. Furthermore, our approach allows specification of a decision-making model for a mobile-manipulation problem on a Turtlebot, spanning from low-level control actions operating on continuous variables all the way up through high-level object manipulation tasks.
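
The layered structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a hypothetical nine-cell corridor split into three rooms, deterministic moves, and illustrative names (value_iteration, plan_and_execute, room_of). An abstract MDP plans over rooms, and each abstract action becomes a subtask with its own local reward and terminal condition, solved over the base-level cells.

```python
# Toy two-level planner in the spirit of AMDPs (all names are illustrative).
N_CELLS = 9          # base-level states: cells 0..8 in a corridor (assumption)
ROOM_SIZE = 3        # abstract states: rooms 0..2
N_ROOMS = N_CELLS // ROOM_SIZE
MOVES = [-1, +1]     # used at both levels: step west or east


def room_of(cell):
    """State abstraction: map a base-level cell to its room."""
    return cell // ROOM_SIZE


def base_step(cell, move):
    """Base-level transition model: a list of (probability, next_cell)."""
    return [(1.0, min(max(cell + move, 0), N_CELLS - 1))]


def abstract_step(room, move):
    """Abstract transition model: a list of (probability, next_room)."""
    return [(1.0, min(max(room + move, 0), N_ROOMS - 1))]


def value_iteration(states, actions, transition, reward, is_terminal,
                    gamma=0.95, tol=1e-8):
    """Return a greedy policy (dict state -> action) for a small finite MDP."""
    values = {s: 0.0 for s in states}

    def q(s, a):
        return sum(p * (reward(s, a, s2) + gamma * values[s2])
                   for p, s2 in transition(s, a))

    while True:
        delta = 0.0
        for s in states:
            if is_terminal(s):
                continue  # terminal states keep value 0
            best = max(q(s, a) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    return {s: max(actions, key=lambda a: q(s, a))
            for s in states if not is_terminal(s)}


def plan_and_execute(start_cell, goal_room):
    """Plan over rooms, then solve one base-level subtask per abstract action."""
    room_policy = value_iteration(
        range(N_ROOMS), MOVES, abstract_step,
        reward=lambda s, a, s2: 1.0 if s2 == goal_room else 0.0,
        is_terminal=lambda s: s == goal_room)

    cell, trace = start_cell, [start_cell]
    while room_of(cell) != goal_room:
        # The abstract action names a target room; that defines the subtask.
        target = abstract_step(room_of(cell), room_policy[room_of(cell)])[0][1]
        # Local reward and terminal set for this subtask only.
        cell_policy = value_iteration(
            range(N_CELLS), MOVES, base_step,
            reward=lambda s, a, s2: 1.0 if room_of(s2) == target else 0.0,
            is_terminal=lambda s: room_of(s) == target)
        while room_of(cell) != target:
            cell = base_step(cell, cell_policy[cell])[0][1]
            trace.append(cell)
    return trace


if __name__ == "__main__":
    print(plan_and_execute(0, 2))
```

In this sketch the top-level planner never reasons about individual cells, and each subtask planner never sees the global goal. That separation is the property the abstract attributes to planning at each level independently; the real framework also covers stochastic dynamics and more than two levels, which this toy omits.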

Published

2017-06-05

How to Cite

Gopalan, N., desJardins, M., Littman, M., MacGlashan, J., Squire, S., Tellex, S., Winder, J., & Wong, L. (2017). Planning with Abstract Markov Decision Processes. Proceedings of the International Conference on Automated Planning and Scheduling, 27(1), 480-488. https://doi.org/10.1609/icaps.v27i1.13867