AUPO – Abstracted Until Proven Otherwise: A Reward Distribution Based Abstraction Algorithm

Robin Schmöcker; Alexander Dockhorn; Bodo Rosenhahn

doi:10.1609/icaps.v36i1.42835

Authors

Robin Schmöcker, Leibniz Universität Hannover, Institut für Informationsverarbeitung
Alexander Dockhorn, SDU Metaverse Lab, University of Southern Denmark
Bodo Rosenhahn, Leibniz Universität Hannover, Institut für Informationsverarbeitung

Abstract

We introduce AUPO, a novel drop-in modification to the decision policy of Monte Carlo Tree Search (MCTS). In comparisons on a range of IPPC benchmark problems, AUPO outperforms vanilla MCTS in domains with dense rewards and value-equivalent sibling actions under finite iteration budgets. AUPO is an automatic action abstraction algorithm that relies solely on reward distribution statistics acquired during the MCTS run. Thus, unlike other automatic abstraction algorithms, AUPO requires neither access to transition probabilities nor a directed acyclic search graph to build its abstraction. This allows AUPO to detect symmetric actions that state-of-the-art frameworks like ASAP struggle with when the resulting symmetric states are far apart in state space. Furthermore, because AUPO only affects the decision policy, it can be combined with other abstraction techniques that only affect the tree search.
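
The abstract does not spell out AUPO's statistical tests, so the following Python snippet is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that each root action keeps simple reward statistics, that two sibling actions stay grouped ("abstracted") as long as normal-approximation confidence intervals on their mean immediate reward overlap, and that the final decision pools value estimates within a group. The names `ActionStats`, `group_actions`, and `decide`, the interval test, and the pooling rule are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ActionStats:
    """Hypothetical per-action statistics gathered during search."""
    visits: int
    value_sum: float      # sum of sampled returns (Q estimate = value_sum / visits)
    reward_sum: float     # sum of sampled immediate rewards
    reward_sq_sum: float  # sum of squared immediate rewards

    def reward_interval(self, z: float = 1.96) -> tuple[float, float]:
        # Normal-approximation confidence interval on the mean immediate reward.
        mean = self.reward_sum / self.visits
        var = max(self.reward_sq_sum / self.visits - mean ** 2, 0.0)
        half = z * math.sqrt(var / self.visits)
        return mean - half, mean + half


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    return a[0] <= b[1] and b[0] <= a[1]


def group_actions(stats: dict[str, ActionStats]) -> list[list[str]]:
    # Greedy grouping: an action joins a group only if its interval overlaps
    # the interval of every member, i.e. nothing has "proven" them different.
    groups: list[list[str]] = []
    for action, s in stats.items():
        for group in groups:
            if all(intervals_overlap(s.reward_interval(),
                                     stats[other].reward_interval())
                   for other in group):
                group.append(action)
                break
        else:
            groups.append([action])
    return groups


def decide(stats: dict[str, ActionStats]) -> str:
    # Decision policy only: the search statistics themselves are untouched.
    groups = group_actions(stats)

    def pooled_value(group: list[str]) -> float:
        return (sum(stats[a].value_sum for a in group)
                / sum(stats[a].visits for a in group))

    best_group = max(groups, key=pooled_value)
    # Within the best group, return the most-visited concrete action.
    return max(best_group, key=lambda a: stats[a].visits)
```

Because the sketch only post-processes statistics at decision time, it reflects the abstract's point that such a modification leaves the tree search itself unchanged. A real implementation would need the distribution comparison described in the paper rather than this simple interval check.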

Published

2026-06-08

How to Cite

Schmöcker, R., Dockhorn, A., & Rosenhahn, B. (2026). AUPO – Abstracted Until Proven Otherwise: A Reward Distribution Based Abstraction Algorithm. Proceedings of the International Conference on Automated Planning and Scheduling, 36(1), 256–265. https://doi.org/10.1609/icaps.v36i1.42835

Issue

Vol. 36 No. 1: Proceedings of the Thirty-Sixth International Conference on Automated Planning and Scheduling

Section

Main Track
