Approximately-Optimal Queries for Planning in Reward-Uncertain Markov Decision Processes

Authors

  • Shun Zhang, University of Michigan
  • Edmund Durfee, University of Michigan
  • Satinder Singh, University of Michigan

DOI:

https://doi.org/10.1609/icaps.v27i1.13833

Abstract

When planning actions to take on behalf of its human operator, a robot might be uncertain about its operator's reward function. We address the problem of how the robot should formulate an (approximately) optimal query to pose to the operator, given how its uncertainty affects which policies it should plan to pursue. We explain how a robot whose queries ask the operator to choose the best from among k choices can, without loss of optimality, restrict consideration to choices only over alternative policies. Further, we present a method for constructing an approximately-optimal policy query that enjoys a performance bound, where the method need not enumerate all policies. Finally, because queries posed to the operator of a robotic system are often expressed in terms of preferences over trajectories rather than policies, we show how our constructed policy query can be projected into the space of trajectory queries. Our empirical results demonstrate that our projection technique can outperform prior techniques for choosing trajectory queries, particularly when the number of trajectories the operator is asked to compare is small.
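The core idea described above — scoring a candidate k-policy query by its expected value under the robot's belief over possible reward functions, and building the query incrementally rather than enumerating all policy sets — can be illustrated with a small sketch. The reward hypotheses, value table, and simple greedy heuristic below are illustrative assumptions for exposition, not the paper's actual algorithm or data:

```python
# Toy illustration (hypothetical numbers): the robot holds a prior over
# three candidate reward functions, and V[policy][reward] gives each
# candidate policy's value under each reward hypothesis.
prior = [0.5, 0.3, 0.2]
V = [
    [10.0, 2.0, 1.0],   # policy 0: strong under reward hypothesis 0
    [ 3.0, 9.0, 2.0],   # policy 1: strong under reward hypothesis 1
    [ 4.0, 4.0, 8.0],   # policy 2: strong under reward hypothesis 2
    [ 6.0, 5.0, 4.0],   # policy 3: a balanced compromise
]

def query_value(query):
    """Expected utility of posing a policy query: under each reward
    hypothesis, the operator picks the best policy in the query."""
    return sum(p * max(V[q][r] for q in query)
               for r, p in enumerate(prior))

def greedy_query(k):
    """Grow a k-policy query greedily by marginal expected gain
    (a common heuristic for this kind of set-selection problem)."""
    query = []
    for _ in range(k):
        best = max((p for p in range(len(V)) if p not in query),
                   key=lambda p: query_value(query + [p]))
        query.append(best)
    return query

q = greedy_query(2)   # e.g. pair the two most complementary policies
```

In this toy instance, the greedy query pairs policy 0 with policy 1, since together they cover the two most probable reward hypotheses and so maximize the expected value of the operator's choice.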

Published

2017-06-05

How to Cite

Zhang, S., Durfee, E., & Singh, S. (2017). Approximately-Optimal Queries for Planning in Reward-Uncertain Markov Decision Processes. Proceedings of the International Conference on Automated Planning and Scheduling, 27(1), 339-347. https://doi.org/10.1609/icaps.v27i1.13833