A PAC Learning Algorithm for LTL and Omega-Regular Objectives in MDPs

Authors

  • Mateo Perez, University of Colorado Boulder
  • Fabio Somenzi, University of Colorado Boulder
  • Ashutosh Trivedi, University of Colorado Boulder

DOI:

https://doi.org/10.1609/aaai.v38i19.30148

Keywords:

General

Abstract

Linear temporal logic (LTL) and omega-regular objectives, a superset of LTL, have recently been used to express non-Markovian objectives in reinforcement learning. We introduce a model-based probably approximately correct (PAC) learning algorithm for omega-regular objectives in Markov decision processes (MDPs). As part of the development of our algorithm, we introduce the epsilon-recurrence time: a measure of the speed at which a policy converges to the satisfaction of the omega-regular objective in the limit. We prove that our algorithm requires only a polynomial number of samples in the relevant parameters, and we perform experiments that confirm our theory.
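As a rough illustration of the model-based PAC idea described above (this is a minimal sketch, not the paper's algorithm), one can estimate an MDP's transition probabilities from samples and use a Hoeffding-style bound to pick a per-(state, action) sample count that is polynomial in 1/epsilon and logarithmic in 1/delta. All function names and the two-state MDP below are hypothetical:

```python
import math
import random

def hoeffding_samples(epsilon, delta):
    """Samples per (state, action) pair so each estimated transition
    probability is within epsilon of the truth with probability at
    least 1 - delta (illustrative Hoeffding bound, not the paper's)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

def estimate_transition(sample_next_state, n, num_states):
    """Empirical estimate of P(s' | s, a) from n sampled transitions."""
    counts = [0] * num_states
    for _ in range(n):
        counts[sample_next_state()] += 1
    return [c / n for c in counts]

if __name__ == "__main__":
    random.seed(0)
    eps, delta = 0.05, 0.01
    n = hoeffding_samples(eps, delta)  # polynomial in 1/eps, log(1/delta)
    # Hypothetical two-state MDP: from (s, a), the next state is 1
    # with probability 0.7 and 0 otherwise.
    true_p = 0.7
    est = estimate_transition(
        lambda: 1 if random.random() < true_p else 0, n, 2)
    print(n, est)
```

The full algorithm must also account for the omega-regular objective itself (via the epsilon-recurrence time introduced in the paper), which this sketch omits.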

Published

2024-03-24

How to Cite

Perez, M., Somenzi, F., & Trivedi, A. (2024). A PAC Learning Algorithm for LTL and Omega-Regular Objectives in MDPs. Proceedings of the AAAI Conference on Artificial Intelligence, 38(19), 21510-21517. https://doi.org/10.1609/aaai.v38i19.30148

Section

AAAI Technical Track on Safe, Robust and Responsible AI Track