Safe Reinforcement Learning via Formal Methods: Toward Safe Control Through Proof and Learning

Nathan Fulton; André Platzer

doi:10.1609/aaai.v32i1.12107

Authors

Nathan Fulton Carnegie Mellon University
André Platzer Carnegie Mellon University

DOI:

https://doi.org/10.1609/aaai.v32i1.12107

Keywords:

Formal Methods, Software Verification, Safe Reinforcement Learning, AI Safety, Hybrid Systems, Cyber-Physical Systems

Abstract

Formal verification provides a high degree of confidence in safe system operation, but only if reality matches the verified model. Although a good model will be accurate most of the time, even the best models are incomplete. This is especially true in Cyber-Physical Systems because high-fidelity physical models of systems are expensive to develop and often intractable to verify. Conversely, reinforcement learning-based controllers are lauded for their flexibility in unmodeled environments, but do not provide guarantees of safe operation. This paper presents an approach for provably safe learning that provides the best of both worlds: the exploration and optimization capabilities of learning along with the safety guarantees of formal verification. Our main insight is that formal verification combined with verified runtime monitoring can ensure the safety of a learning agent. Verification results are preserved whenever learning agents limit exploration within the confounds of verified control choices as long as observed reality comports with the model used for off-line verification. When a model violation is detected, the agent abandons efficiency and instead attempts to learn a control strategy that guides the agent to a modeled portion of the state space. We prove that our approach toward incorporating knowledge about safe control into learning systems preserves safety guarantees, and demonstrate that we retain the empirical performance benefits provided by reinforcement learning. We also explore various points in the design space for these justified speculative controllers in a simple model of adaptive cruise control model for autonomous cars.

Safe Reinforcement Learning via Formal Methods: Toward Safe Control Through Proof and Learning

Authors

DOI:

Keywords:

Abstract

Downloads

Published

How to Cite

Issue

Section

Information