Approximately-Optimal Queries for Planning in Reward-Uncertain Markov Decision Processes

Authors

  • Shun Zhang, University of Michigan
  • Edmund Durfee, University of Michigan
  • Satinder Singh, University of Michigan

DOI:

https://doi.org/10.1609/icaps.v27i1.13833

Abstract

When planning actions to take on behalf of its human operator, a robot might be uncertain about its operator's reward function. We address the problem of how the robot should formulate an (approximately) optimal query to pose to the operator, given how its uncertainty affects which policies it should plan to pursue. We explain how a robot whose queries ask the operator to choose the best from among k choices can, without loss of optimality, restrict consideration to choices only over alternative policies. Further, we present a method for constructing an approximately-optimal policy query that enjoys a performance bound, where the method need not enumerate all policies. Finally, because queries posed to the operator of a robotic system are often expressed in terms of preferences over trajectories rather than policies, we show how our constructed policy query can be projected into the space of trajectory queries. Our empirical results demonstrate that our projection technique can outperform prior techniques for choosing trajectory queries, particularly when the number of trajectories the operator is asked to compare is small.
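The core idea described above — scoring a candidate k-policy query by its expected value under the robot's belief over possible reward functions, and building the query incrementally rather than enumerating all policy sets — can be illustrated with a small sketch. The reward hypotheses, value table, and simple greedy heuristic below are illustrative assumptions for exposition, not the paper's actual algorithm or data:

```python
# Toy illustration (hypothetical numbers): the robot holds a prior over
# three candidate reward functions, and V[policy][reward] gives each
# candidate policy's value under each reward hypothesis.
prior = [0.5, 0.3, 0.2]
V = [
    [10.0, 2.0, 1.0],   # policy 0: strong under reward hypothesis 0
    [ 3.0, 9.0, 2.0],   # policy 1: strong under reward hypothesis 1
    [ 4.0, 4.0, 8.0],   # policy 2: strong under reward hypothesis 2
    [ 6.0, 5.0, 4.0],   # policy 3: a balanced compromise
]

def query_value(query):
    """Expected utility of posing a policy query: under each reward
    hypothesis, the operator picks the best policy in the query."""
    return sum(p * max(V[q][r] for q in query)
               for r, p in enumerate(prior))

def greedy_query(k):
    """Grow a k-policy query greedily by marginal expected gain
    (a common heuristic for this kind of set-selection problem)."""
    query = []
    for _ in range(k):
        best = max((p for p in range(len(V)) if p not in query),
                   key=lambda p: query_value(query + [p]))
        query.append(best)
    return query

q = greedy_query(2)   # e.g. pair the two most complementary policies
```

In this toy instance, the greedy query pairs policy 0 with policy 1, since together they cover the two most probable reward hypotheses and so maximize the expected value of the operator's choice.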

Published

2017-06-05

How to Cite

Zhang, S., Durfee, E., & Singh, S. (2017). Approximately-Optimal Queries for Planning in Reward-Uncertain Markov Decision Processes. Proceedings of the International Conference on Automated Planning and Scheduling, 27(1), 339-347. https://doi.org/10.1609/icaps.v27i1.13833