On Shallow Planning Under Partial Observability
DOI: https://doi.org/10.1609/aaai.v39i25.34860

Abstract
Formulating a real-world problem under the Reinforcement Learning framework involves non-trivial design choices, such as selecting a discount factor for the learning objective (discounted cumulative rewards), which articulates the planning horizon of the agent. This work investigates the impact of the discount factor on the bias-variance trade-off, given structural parameters of the underlying Markov Decision Process. Our results support the idea that a shorter planning horizon can be beneficial, especially under partial observability.
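The abstract's link between the discount factor and the planning horizon can be illustrated with the standard rule of thumb that a discount factor gamma induces an effective horizon of roughly 1/(1 - gamma). The sketch below (not from the paper; the names and reward stream are illustrative) shows that truncating a discounted return at that horizon already captures a large fraction of its value:

```python
# Illustrative sketch: how the discount factor gamma shapes the
# effective planning horizon of a discounted-return objective.

def discounted_return(rewards, gamma):
    """Discounted cumulative reward: sum_t gamma^t * r_t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

def effective_horizon(gamma):
    """Common rule of thumb: horizon ~ 1 / (1 - gamma)."""
    return 1.0 / (1.0 - gamma)

# A constant reward stream, purely for illustration.
rewards = [1.0] * 1000

for gamma in (0.9, 0.99):
    h = int(effective_horizon(gamma))
    full = discounted_return(rewards, gamma)
    truncated = discounted_return(rewards[:h], gamma)
    # Rewards beyond the effective horizon contribute little to the return.
    print(f"gamma={gamma}: horizon~{h}, truncated/full = {truncated / full:.2f}")
```

A smaller gamma thus shortens the horizon the agent effectively plans over, which is the design knob whose bias-variance consequences the paper studies.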
Published
2025-04-11
How to Cite
Lefebvre, R., & Durand, A. (2025). On Shallow Planning Under Partial Observability. Proceedings of the AAAI Conference on Artificial Intelligence, 39(25), 26587–26595. https://doi.org/10.1609/aaai.v39i25.34860
Section
AAAI Technical Track on Planning, Routing, and Scheduling