PLEASE: Palm Leaf Search for POMDPs with Large Observation Spaces

Authors

  • Zongzhang Zhang, Soochow University
  • David Hsu, National University of Singapore
  • Wee Sun Lee, National University of Singapore
  • Zhan Wei Lim, National University of Singapore
  • Aijun Bai, University of Science and Technology of China

DOI:

https://doi.org/10.1609/icaps.v25i1.13706

Keywords:

POMDPs, Large Observation Space, Point-based Value Iteration, Heuristics, Efficiency, Palm Leaf Search, Observation Selection

Abstract

Trial-based asynchronous value iteration algorithms for large Partially Observable Markov Decision Processes (POMDPs), such as HSVI2, FSVI and SARSOP, have made impressive progress in the past decade. In the forward exploration phase of these algorithms, only the outcome with the highest potential impact is searched. This paper presents a novel approach, called Palm LEAf SEarch (PLEASE), which allows more than one outcome to be selected when their potential impacts are close to the highest one. Compared with existing trial-based algorithms, PLEASE needs fewer point-based value backups and therefore spends considerably less time propagating the bound improvements of beliefs deep in the search tree back to the root belief. Experiments show that PLEASE scales up SARSOP, one of the fastest algorithms, by orders of magnitude on some POMDP tasks with large observation spaces.
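
The selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact selection rule, so the relative threshold `zeta`, the function name `select_outcomes`, and the per-outcome impact scores are all assumptions made for illustration, not the authors' implementation. With `zeta = 1.0` the sketch reduces to the single-outcome choice made by existing trial-based algorithms.

```python
from typing import Dict, List


def select_outcomes(impacts: Dict[str, float], zeta: float = 0.9) -> List[str]:
    """Return every outcome whose impact is close to the highest one.

    `impacts` maps an outcome label to a hypothetical "potential impact"
    score. `zeta` is an assumed relative threshold in (0, 1]: an outcome
    is kept when its score is at least zeta times the best score.
    """
    if not 0.0 < zeta <= 1.0:
        raise ValueError("zeta must lie in (0, 1]")
    if not impacts:
        return []
    best = max(impacts.values())
    if best <= 0.0:
        # Assumption: no outcome is worth exploring when nothing has
        # positive impact, so the trial would stop here.
        return []
    return [o for o, score in impacts.items() if score >= zeta * best]


if __name__ == "__main__":
    scores = {"o1": 1.0, "o2": 0.95, "o3": 0.2}
    print(select_outcomes(scores, zeta=0.9))  # several near-best outcomes
    print(select_outcomes(scores, zeta=1.0))  # only the single best outcome
```

Lowering `zeta` makes each trial branch over more outcomes, which is the trade-off the abstract describes: more beliefs are explored per trial in exchange for fewer backups along the path to the root.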

Published

2015-04-08

How to Cite

Zhang, Z., Hsu, D., Lee, W. S., Lim, Z. W., & Bai, A. (2015). PLEASE: Palm Leaf Search for POMDPs with Large Observation Spaces. Proceedings of the International Conference on Automated Planning and Scheduling, 25(1), 249-257. https://doi.org/10.1609/icaps.v25i1.13706