Inverse Abstraction of Neural Networks Using Symbolic Interpolation

Sumanth Dathathri; Sicun Gao; Richard M. Murray

doi:10.1609/aaai.v33i01.33013437

Authors

Sumanth Dathathri, California Institute of Technology
Sicun Gao, University of California, San Diego
Richard M. Murray, California Institute of Technology

DOI: https://doi.org/10.1609/aaai.v33i01.33013437

Abstract

Neural networks in real-world applications have to satisfy critical properties such as safety and reliability. The analysis of such properties typically requires extracting information by computing pre-images of the network transformations, but it is well known that explicit computation of pre-images is intractable. We introduce new methods for computing compact symbolic abstractions of pre-images by computing their overapproximations and underapproximations through all layers. The abstraction of pre-images enables formal analysis and knowledge extraction without affecting standard learning algorithms. We use inverse abstractions to automatically extract simple control laws and compact representations for pre-images corresponding to unsafe outputs. We illustrate that the extracted abstractions are interpretable and can be used for analyzing complex properties.
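
To make the intractability claim concrete, the following is a minimal sketch of explicit pre-image computation for a toy network. It is not the symbolic-interpolation method of the paper. It assumes a single hidden ReLU layer with a scalar output, f(x) = w2 · relu(W1 x + b1) + b2, and an "unsafe" output set {y : y >= t}. All names (`preimage_pieces`, `in_preimage`, `forward`) and all parameter values are hypothetical. The exact pre-image is a union of polytopes, one per ReLU activation pattern, so the number of pieces grows as 2^n in the number n of hidden units.

```python
import itertools
import numpy as np

def forward(x, W1, b1, w2, b2):
    """Toy network (an assumption for illustration): one hidden ReLU layer, scalar output."""
    return w2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def preimage_pieces(W1, b1, w2, b2, t):
    """Enumerate the exact pre-image of {y : y >= t} as a union of polytopes.

    Returns a list of (A, c) pairs; each pair describes the polytope
    {x : A @ x <= c}. Some polytopes may be empty. There is one pair per
    activation pattern, i.e. 2**n_hidden pairs in total.
    """
    n_hidden = W1.shape[0]
    pieces = []
    for pattern in itertools.product([0, 1], repeat=n_hidden):
        d = np.array(pattern, dtype=float)
        # Active unit:   W1 x + b1 >= 0  <=>  -W1 x <= b1
        # Inactive unit: W1 x + b1 <= 0  <=>   W1 x <= -b1
        sign = np.where(d == 1, -1.0, 1.0)
        A_act = sign[:, None] * W1
        c_act = -sign * b1
        # On this pattern the network is affine: f(x) = g @ x + h.
        g = (w2 * d) @ W1
        h = (w2 * d) @ b1 + b2
        # Output constraint f(x) >= t  <=>  -g @ x <= h - t
        A = np.vstack([A_act, -g[None, :]])
        c = np.concatenate([c_act, [h - t]])
        pieces.append((A, c))
    return pieces

def in_preimage(x, pieces, tol=1e-9):
    """Membership test: x is in the pre-image if it lies in at least one polytope."""
    return any(bool(np.all(A @ x <= c + tol)) for A, c in pieces)

# Hypothetical example: 2 inputs, 3 hidden units, so 2**3 = 8 candidate polytopes.
W1 = np.array([[1.0, -1.0], [0.5, 2.0], [-1.5, 0.3]])
b1 = np.array([0.1, -0.2, 0.4])
w2 = np.array([1.0, -0.7, 0.5])
b2 = 0.05
pieces = preimage_pieces(W1, b1, w2, b2, t=0.5)
print(len(pieces))
```

With a few hundred hidden units the enumeration above is already out of reach, which is the kind of blow-up that motivates working with compact overapproximations and underapproximations instead of the exact set.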
