PLEASE: Palm Leaf Search for POMDPs with Large Observation Spaces

Zongzhang Zhang; David Hsu; Wee Sun Lee; Zhan Wei Lim; Aijun Bai

doi:10.1609/icaps.v25i1.13706

Authors

Zongzhang Zhang, Soochow University
David Hsu, National University of Singapore
Wee Sun Lee, National University of Singapore
Zhan Wei Lim, National University of Singapore
Aijun Bai, University of Science and Technology of China

DOI:

https://doi.org/10.1609/icaps.v25i1.13706

Keywords:

POMDPs, Large Observation Space, Point-based Value Iteration, Heuristics, Efficiency, Palm Leaf Search, Observation Selection

Abstract

Trial-based asynchronous value iteration algorithms for large Partially Observable Markov Decision Processes (POMDPs), such as HSVI2, FSVI, and SARSOP, have made impressive progress in the past decade. In the forward exploration phase of these algorithms, only the outcome with the highest potential impact is searched. This paper presents a novel approach, called Palm LEAf SEarch (PLEASE), which allows the selection of more than one outcome when their potential impacts are close to the highest one. Compared with existing trial-based algorithms, PLEASE needs fewer point-based value backups and can therefore save considerable time when propagating bound improvements from beliefs deep in the search tree to the root belief. Experiments show that PLEASE scales up SARSOP, one of the fastest algorithms, by orders of magnitude on some POMDP tasks with large observation spaces.
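
The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_outcomes`, the `ratio` parameter, and the rule "keep every outcome whose impact is at least `ratio` times the maximum" are assumptions chosen only to show what "close to the highest one" might mean in code. The abstract does not say how potential impact is computed or how closeness is measured.

```python
from typing import List, Sequence


def select_outcomes(impacts: Sequence[float], ratio: float = 0.9) -> List[int]:
    """Return indices of outcomes whose potential impact is near the best.

    `impacts` holds one (hypothetical) potential-impact score per outcome.
    `ratio` is an assumed closeness parameter in (0, 1]; ratio == 1.0 keeps
    only the outcomes tied for the maximum, mimicking single-outcome search.
    """
    if not impacts:
        return []
    best = max(impacts)
    # Assumption: a non-positive best score means nothing is worth exploring.
    if best <= 0:
        return []
    threshold = ratio * best
    return [i for i, score in enumerate(impacts) if score >= threshold]


if __name__ == "__main__":
    scores = [0.50, 0.48, 0.10, 0.46]
    # With ratio=0.9 the threshold is 0.45, so three outcomes are expanded.
    print(select_outcomes(scores, ratio=0.9))
    # With ratio=1.0 only the single best outcome is expanded.
    print(select_outcomes(scores, ratio=1.0))
```

Under these assumptions, a looser `ratio` expands more outcomes per trial, which is the trade-off the abstract describes: more beliefs are explored in one pass so that bound improvements reach the root with fewer backups.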
