Minimizing Maximum Regret in Commitment Constrained Sequential Decision Making

Qi Zhang; Satinder Singh; Edmund Durfee

doi:10.1609/icaps.v27i1.13836

Authors

Qi Zhang University of Michigan
Satinder Singh University of Michigan
Edmund Durfee University of Michigan

DOI:

https://doi.org/10.1609/icaps.v27i1.13836

Abstract

In cooperative multiagent planning, it can often be beneficial for an agent to make commitments about aspects of its behavior to others, allowing them in turn to plan their own behaviors without taking the agent's detailed behavior into account. Extending previous work in the Bayesian setting, we consider instead a worst-case setting in which the agent has a set of possible environments (MDPs) it could be in, and develop a commitment semantics that allows for probabilistic guarantees on the agent's behavior in any of the environments it could end up facing. Crucially, an agent receives observations (of reward and state transitions) that allow it to potentially eliminate possible environments and thus obtain higher utility by adapting its policy to the history of observations. We develop algorithms and provide theory and some preliminary empirical results showing that they ensure an agent meets its commitments with history-dependent policies while minimizing maximum regret over the possible environments.
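The minimax-regret objective described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a finite set of hypothetical candidate policies whose values and commitment-satisfaction probabilities in each possible environment are already tabulated, and it measures regret against the best commitment-satisfying candidate in each environment. The function name, the tables, and the threshold `rho` are illustrative assumptions, and the sketch ignores the history-dependent adaptation that the paper studies.

```python
def minimax_regret_policy(values, commit_probs, rho):
    """Pick the candidate policy with the smallest worst-case regret.

    values[p][e]       -- assumed expected utility of policy p in environment e
    commit_probs[p][e] -- assumed probability that p fulfils the commitment in e
    rho                -- required commitment probability (hypothetical threshold)
    """
    n_envs = len(next(iter(values.values())))

    # Keep only policies that meet the commitment in every possible environment.
    feasible = [p for p in values if all(q >= rho for q in commit_probs[p])]
    if not feasible:
        raise ValueError("no candidate meets the commitment in every environment")

    # Best achievable value per environment among the feasible candidates.
    best = [max(values[p][e] for p in feasible) for e in range(n_envs)]

    # Worst-case regret of each feasible policy across environments.
    max_regret = {
        p: max(best[e] - values[p][e] for e in range(n_envs)) for p in feasible
    }
    chosen = min(max_regret, key=max_regret.get)
    return chosen, max_regret[chosen]


# Toy data: three candidate policies, two possible environments.
values = {"a": [10, 1], "b": [7, 5], "c": [12, 12]}
commit_probs = {"a": [0.9, 0.9], "b": [0.95, 0.9], "c": [0.5, 0.99]}
chosen, regret = minimax_regret_policy(values, commit_probs, rho=0.8)
```

In this toy data, policy "c" has the highest value everywhere but is excluded because it misses the commitment threshold in the first environment, so the choice is made between "a" and "b" only.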

Published

2017-06-05

How to Cite

Zhang, Q., Singh, S., & Durfee, E. (2017). Minimizing Maximum Regret in Commitment Constrained Sequential Decision Making. Proceedings of the International Conference on Automated Planning and Scheduling, 27(1), 348–356. https://doi.org/10.1609/icaps.v27i1.13836

Issue

Vol. 27 (2017): Twenty-Seventh International Conference on Automated Planning and Scheduling

Section

Main Track
