Better Be Lucky than Good: Exceeding Expectations in MDP Evaluation

Authors

  • Thomas Keller, University of Freiburg
  • Florian Geißer, University of Freiburg

DOI:

https://doi.org/10.1609/aaai.v29i1.9698

Keywords:

Optimal Stopping Problem, Secretary Problem, MDP, Planning under Uncertainty, IPPC, UCT

Abstract

We introduce the MDP-Evaluation Stopping Problem, the optimization problem faced by participants of the International Probabilistic Planning Competition 2014 who focus on their own performance. It can be cast as a meta-MDP in which each action corresponds to executing a policy on a base MDP; solving this meta-MDP directly is intractable in practice. Our theoretical analysis reveals tractable special cases in which the problem reduces to an optimal stopping problem. By relaxing the general problem to an optimal stopping problem, we derive approximate strategies of high quality, and we show both theoretically and experimentally that it not only pays off to pursue luck in the execution of the optimal policy, but that there are even cases where it is better to be lucky than good: executing a suboptimal base policy can be part of an optimal strategy in the meta-MDP.
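The abstract's reduction builds on the classic optimal stopping setting, of which the secretary problem (listed in the keywords) is the best-known instance. As background, here is a minimal sketch of the standard 1/e cutoff rule for the secretary problem; this is the textbook strategy, not the approximate meta-MDP strategies derived in the paper, and all function names are illustrative:

```python
import math
import random

def secretary_rule(values, cutoff):
    """Observe the first `cutoff` values without committing, then accept
    the first later value that beats everything seen so far (falling
    back to the last value if none does)."""
    best_seen = max(values[:cutoff]) if cutoff > 0 else float("-inf")
    for v in values[cutoff:]:
        if v > best_seen:
            return v
    return values[-1]

def success_rate(n=50, trials=20000, seed=0):
    """Estimate how often the 1/e rule selects the true maximum of a
    random sequence of n distinct values."""
    rng = random.Random(seed)
    cutoff = round(n / math.e)  # classic observation phase of length n/e
    hits = 0
    for _ in range(trials):
        values = [rng.random() for _ in range(n)]
        if secretary_rule(values, cutoff) == max(values):
            hits += 1
    return hits / trials
```

For moderate n, the estimated success probability hovers around the theoretical 1/e ≈ 0.37, which is why the cutoff rule is a useful baseline intuition for stopping-problem relaxations like the one described in the abstract.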

Published

2015-03-04

How to Cite

Keller, T., & Geißer, F. (2015). Better Be Lucky than Good: Exceeding Expectations in MDP Evaluation. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1). https://doi.org/10.1609/aaai.v29i1.9698

Section

AAAI Technical Track: Reasoning under Uncertainty