Efficient Decision-Theoretic Target Localization

Authors

  • Louis Dressel, Stanford University
  • Mykel Kochenderfer, Stanford University

DOI:

https://doi.org/10.1609/icaps.v27i1.13832

Abstract

Partially observable Markov decision processes (POMDPs) offer a principled approach to control under uncertainty. However, POMDP solvers generally require rewards to depend only on the state and action. This limitation is unsuitable for information-gathering problems, where rewards are more naturally expressed as functions of belief. In this work, we consider target localization, an information-gathering task where an agent takes actions leading to informative observations and a concentrated belief over possible target locations. By leveraging recent theoretical and algorithmic advances, we investigate offline and online solvers that incorporate belief-dependent rewards. We extend SARSOP — a state-of-the-art offline solver — to handle belief-dependent rewards, exploring different reward strategies and showing how they can be compactly represented. We present an improved lower bound that greatly speeds convergence. POMDP-lite, an online solver, is also evaluated in the context of information-gathering tasks. These solvers are applied to control a hexcopter UAV searching for a radio frequency source—a challenging real-world problem.
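As an illustration of a reward expressed as a function of belief (a common choice in information gathering, not necessarily the exact reward strategy evaluated in the paper), the sketch below scores a belief by its negative Shannon entropy, so concentrating probability mass on a single candidate target cell yields higher reward. The grid size and function name are assumptions made for the example.

    import numpy as np

    def belief_entropy_reward(belief, eps=1e-12):
        """Hypothetical belief-dependent reward: negative Shannon entropy.

        `belief` is a probability vector over candidate target cells; the
        reward is highest when the mass is concentrated on one cell.
        """
        b = np.asarray(belief, dtype=float)
        b = b / b.sum()                             # normalize defensively
        return float(np.sum(b * np.log(b + eps)))   # = -H(b), maximal for a point mass

    # Example: a uniform belief scores lower than a concentrated one.
    uniform = np.full(100, 1 / 100)            # flat belief over a 10x10 search grid
    peaked = np.zeros(100); peaked[42] = 1.0   # target location nearly pinned down
    assert belief_entropy_reward(peaked) > belief_entropy_reward(uniform)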

Published

2017-06-05

How to Cite

Dressel, L., & Kochenderfer, M. (2017). Efficient Decision-Theoretic Target Localization. Proceedings of the International Conference on Automated Planning and Scheduling, 27(1), 70-78. https://doi.org/10.1609/icaps.v27i1.13832