RAO*: An Algorithm for Chance-Constrained POMDP's

Authors

  • Pedro Rodrigues Quemel e Assis Santana, Massachusetts Institute of Technology
  • Sylvie Thiébaux, The Australian National University and NICTA
  • Brian Williams, Massachusetts Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v30i1.10423

Keywords:

chance constraint, POMDP, heuristic search

Abstract

Autonomous agents operating in partially observable stochastic environments often face the problem of optimizing expected performance while bounding the risk of violating safety constraints. Such problems can be modeled as chance-constrained POMDP's (CC-POMDP's). Our first contribution is a systematic derivation of execution risk in POMDP domains, which improves upon how chance constraints are handled in the constrained POMDP literature. Second, we present RAO*, a heuristic forward search algorithm producing optimal, deterministic, finite-horizon policies for CC-POMDP's. In addition to the utility heuristic, RAO* leverages an admissible execution risk heuristic to quickly detect and prune overly risky policy branches. Third, we demonstrate the usefulness of RAO* in two challenging domains of practical interest: power supply restoration and autonomous science agents.
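The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical `BeliefNode` policy tree, a simple recursion that combines the immediate violation probability with the observation-weighted risk of the children, and a trivial admissible heuristic of zero future risk at unexpanded leaves. The names `BeliefNode`, `risk_lower_bound`, `should_prune`, and the bound `delta` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BeliefNode:
    """Hypothetical node of a partially expanded policy tree."""
    violation_prob: float  # assumed chance of violating a constraint at this belief
    # (observation probability, child node) pairs; empty means not yet expanded
    children: List[Tuple[float, "BeliefNode"]] = field(default_factory=list)


def risk_lower_bound(node: BeliefNode, leaf_heuristic: float = 0.0) -> float:
    """Lower bound on the risk of executing the policy from this node.

    Unexpanded leaves use `leaf_heuristic` as their future risk. Zero never
    overestimates the true risk, so the resulting bound is admissible.
    """
    if node.children:
        future = sum(p * risk_lower_bound(child, leaf_heuristic)
                     for p, child in node.children)
    else:
        future = leaf_heuristic
    # Either the constraint is violated now, or it is not and risk accrues later.
    return node.violation_prob + (1.0 - node.violation_prob) * future


def should_prune(node: BeliefNode, delta: float) -> bool:
    """A branch can be discarded once even its optimistic risk exceeds delta."""
    return risk_lower_bound(node) > delta


safe = BeliefNode(violation_prob=0.0)
risky = BeliefNode(violation_prob=0.5)
root = BeliefNode(violation_prob=0.1, children=[(0.5, safe), (0.5, risky)])
print(should_prune(root, delta=0.2))
print(should_prune(root, delta=0.4))
```

Because the bound is optimistic, a branch rejected by `should_prune` cannot become acceptable after further expansion, which is what makes early pruning safe in this simplified setting.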

Published

2016-03-05

How to Cite

Rodrigues Quemel e Assis Santana, P., Thiébaux, S., & Williams, B. (2016). RAO*: An Algorithm for Chance-Constrained POMDP’s. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1). https://doi.org/10.1609/aaai.v30i1.10423

Section

Technical Papers: Reasoning under Uncertainty