Symbolic Dynamic Programming for Continuous State and Action MDPs

Authors

  • Zahra Zamani, The Australian National University (ANU) and NICTA (National ICT Australia)
  • Scott Sanner, NICTA and The Australian National University
  • Cheng Fang, Department of Aeronautics and Astronautics, MIT

DOI:

https://doi.org/10.1609/aaai.v26i1.8372

Keywords:

MDP, Symbolic Dynamic Programming, Continuous state-action, Machine Learning, AI

Abstract

Many real-world decision-theoretic planning problems are naturally modeled using both continuous state and action (CSA) spaces, yet little work has provided exact solutions for the case of continuous actions. In this work, we propose a symbolic dynamic programming (SDP) solution to obtain the optimal closed-form value function and policy for CSA-MDPs with multivariate continuous state and actions, discrete noise, piecewise linear dynamics, and piecewise linear (or restricted piecewise quadratic) reward. Our key contribution over previous SDP work is to show how the continuous action maximization step in the dynamic programming backup can be evaluated optimally and symbolically: a task which amounts to symbolic constrained optimization subject to unknown state parameters. We further integrate this technique to work with an efficient and compact data structure for SDP, the extended algebraic decision diagram (XADD). We demonstrate empirical results on a didactic nonlinear planning example and two domains from operations research to show the first automated exact solution to these problems.
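To give a flavor of the continuous-action maximization step the abstract describes, here is a minimal, purely illustrative sketch (not the paper's XADD implementation): for a Q-function that is linear in the action on a bounded interval, the maximum lies at an interval endpoint, and which endpoint wins depends symbolically on the state, producing a piecewise (case-split) value function. All names, the toy Q-function, and the coefficient encoding below are hypothetical.

```python
# Linear-in-s expressions are encoded as pairs (coef_s, const): value = coef_s*s + const.

def sdp_max_action(b, c_a, c_s, c0, lo, hi):
    """Symbolically maximize q(s, a) = b*s*a + c_a*a + c_s*s + c0
    over a in [lo, hi].  Since q is linear in a, the max sits at an
    endpoint; which endpoint depends on the sign of the action slope
    b*s + c_a, which is unknown until s is known.  The result is a
    case split: a piecewise-linear value function of the state s,
    mirroring the symbolic constrained maximization in an SDP backup."""
    slope = (b, c_a)                       # d q / d a, linear in s
    at_hi = (b * hi + c_s, c_a * hi + c0)  # q with a substituted by hi
    at_lo = (b * lo + c_s, c_a * lo + c0)  # q with a substituted by lo
    # Each case is ((cond linear expr) >= 0, value linear expr).
    return [(slope, at_hi), ((-b, -c_a), at_lo)]

def evaluate(cases, s):
    """Evaluate the piecewise value function at a concrete state s."""
    for (cs, cc), (vs, vc) in cases:
        if cs * s + cc >= 0:
            return vs * s + vc
    raise ValueError("no case matched")

# Toy Q-function q(s, a) = 2*s*a - a + s, so the slope in a is 2*s - 1:
# for s >= 1/2 the optimal action is a = 1, otherwise a = 0.
v = sdp_max_action(b=2, c_a=-1, c_s=1, c0=0, lo=0, hi=1)
print(evaluate(v, 1.0))  # a = 1 is optimal: 3*s - 1 = 2.0
print(evaluate(v, 0.0))  # a = 0 is optimal: s = 0.0
```

The real algorithm performs this case analysis on arbitrary piecewise-linear Q-functions and keeps the resulting cases compact inside an XADD rather than a flat case list.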

Published

2021-09-20

How to Cite

Zamani, Z., Sanner, S., & Fang, C. (2021). Symbolic Dynamic Programming for Continuous State and Action MDPs. Proceedings of the AAAI Conference on Artificial Intelligence, 26(1), 1839-1845. https://doi.org/10.1609/aaai.v26i1.8372

Section

Reasoning about Plans, Processes and Actions