LB-DESPOT: Efficient Online POMDP Planning Considering Lower Bound in Action Selection (Student Abstract)

Chenyang Wu; Rui Kong; Guoyu Yang; Xianghan Kong; Zongzhang Zhang; Yang Yu; Dong Li; Wulong Liu

doi:10.1609/aaai.v35i18.17960

Authors

Chenyang Wu, National Key Lab for Novel Software Technology, Nanjing University
Rui Kong, National Key Lab for Novel Software Technology, Nanjing University
Guoyu Yang, National Key Lab for Novel Software Technology, Nanjing University
Xianghan Kong, National Key Lab for Novel Software Technology, Nanjing University
Zongzhang Zhang, National Key Lab for Novel Software Technology, Nanjing University
Yang Yu, National Key Lab for Novel Software Technology, Nanjing University
Dong Li, Noah’s Ark Lab, Huawei Company
Wulong Liu, Noah’s Ark Lab, Huawei Company

DOI:

https://doi.org/10.1609/aaai.v35i18.17960

Keywords:

Planning Under Uncertainty, POMDP, Online Planning

Abstract

A partially observable Markov decision process (POMDP) is an extension of the Markov decision process (MDP). It models state uncertainty by specifying the probability of receiving a particular observation given the current state. DESPOT is one of the most popular scalable online planning algorithms for POMDPs. By considering only $K$ scenarios, it significantly reduces the size of the decision tree while still deriving a near-optimal policy. Nevertheless, DESPOT uses different action selection criteria during planning and execution. During planning, it repeatedly chooses the action with the highest upper bound, whereas once planning ends, the action with the highest lower bound is chosen for execution. We propose LB-DESPOT to alleviate this issue by also using the lower bound when selecting an action branch to expand. Empirically, our method attains better performance than DESPOT and POMCP, another state-of-the-art online planner, on several challenging POMDP benchmark tasks.
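The gap between the two selection criteria can be illustrated with a small sketch. The first two functions follow the criteria stated in the abstract. Everything else is an assumption made for illustration: the `ActionNode` structure, the numeric bounds, and in particular `select_lower_bound_aware`, which is a hypothetical convex combination of the two bounds. The abstract does not state the exact rule LB-DESPOT uses, so this should not be read as the authors' method.

```python
from dataclasses import dataclass


@dataclass
class ActionNode:
    """Value bounds for one action branch at the root (hypothetical structure)."""
    name: str
    lower: float  # lower bound on the action's value
    upper: float  # upper bound on the action's value


def select_for_expansion(actions):
    """Planning-time criterion described for DESPOT: highest upper bound."""
    return max(actions, key=lambda a: a.upper)


def select_for_execution(actions):
    """Execution-time criterion described for DESPOT: highest lower bound."""
    return max(actions, key=lambda a: a.lower)


def select_lower_bound_aware(actions, weight=0.5):
    """Hypothetical rule, NOT taken from the paper: score each action by a
    convex combination of its bounds so the lower bound affects expansion."""
    return max(actions, key=lambda a: weight * a.lower + (1 - weight) * a.upper)


# Made-up bounds chosen so that the two DESPOT criteria disagree.
actions = [
    ActionNode("a1", lower=2.0, upper=10.0),
    ActionNode("a2", lower=6.0, upper=9.0),
]

print(select_for_expansion(actions).name)      # a1: highest upper bound
print(select_for_execution(actions).name)      # a2: highest lower bound
print(select_lower_bound_aware(actions).name)  # a2 under the assumed weighting
```

With these made-up bounds, planning keeps expanding `a1` while execution would pick `a2`, so search effort goes to a branch that is not the one eventually executed. Any rule that lets the lower bound influence expansion, such as the assumed weighting above, narrows that mismatch.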
