Inverse Abstraction of Neural Networks Using Symbolic Interpolation

Authors

  • Sumanth Dathathri, California Institute of Technology
  • Sicun Gao, University of California, San Diego
  • Richard M. Murray, California Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v33i01.33013437

Abstract

Neural networks in real-world applications have to satisfy critical properties such as safety and reliability. Analyzing such properties typically requires computing pre-images of the network transformations, but explicit computation of pre-images is well known to be intractable. We introduce new methods for computing compact symbolic abstractions of pre-images by computing their overapproximations and underapproximations through all layers. The abstraction of pre-images enables formal analysis and knowledge extraction without affecting standard learning algorithms. We use inverse abstractions to automatically extract simple control laws and compact representations for pre-images corresponding to unsafe outputs. We illustrate that the extracted abstractions are interpretable and can be used for analyzing complex properties.
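As a rough illustration of the layer-by-layer, backward view described in the abstract, the sketch below (Python, using NumPy and SciPy) propagates an output box backward through a single affine + ReLU layer and then bounds the resulting polytope with small linear programs. This is a minimal, assumed stand-in for intuition only, not the paper's symbolic-interpolation procedure; all names (W, b, relu_preimage_box, affine_preimage_as_box) are illustrative and not taken from the paper.

# Hedged sketch: overapproximate the pre-image of an output box through one
# affine + ReLU layer.  Not the paper's method; illustrative only.
import numpy as np
from scipy.optimize import linprog

def relu_preimage_box(l_out, u_out):
    """Overapproximate the pre-image of the box [l_out, u_out] under elementwise ReLU.

    If a coordinate's lower bound is positive, the pre-activation must lie in
    [l_out, u_out]; otherwise any non-positive pre-activation maps to 0, which
    (for attainable bounds) stays in the box, so the lower bound drops to -inf.
    """
    l_pre = np.where(l_out > 0, l_out, -np.inf)
    return l_pre, u_out.copy()

def affine_preimage_as_box(W, b, l_pre, u_pre, x_lo, x_hi):
    """Overapproximate {x in [x_lo, x_hi] : l_pre <= W x + b <= u_pre} by a box.

    The exact pre-image of a box under an affine map is a polytope; here each
    coordinate of x is bounded by solving two LPs (min and max of x_i).
    """
    n = W.shape[1]
    finite_u = np.isfinite(u_pre)
    finite_l = np.isfinite(l_pre)
    # Rewrite l_pre <= W x + b <= u_pre as A x <= h, dropping infinite bounds.
    A = np.vstack([W[finite_u], -W[finite_l]])
    h = np.concatenate([(u_pre - b)[finite_u], (b - l_pre)[finite_l]])
    bounds = list(zip(x_lo, x_hi))
    box_lo, box_hi = np.empty(n), np.empty(n)
    for i in range(n):
        c = np.zeros(n); c[i] = 1.0
        lo = linprog(c, A_ub=A, b_ub=h, bounds=bounds)
        hi = linprog(-c, A_ub=A, b_ub=h, bounds=bounds)
        if not (lo.success and hi.success):   # pre-image is empty in the domain
            return None
        box_lo[i], box_hi[i] = lo.fun, -hi.fun
    return box_lo, box_hi

# Toy usage: which inputs in [-1, 1]^2 can produce an output with y_0 >= 0.5?
W = np.array([[1.0, -1.0], [0.5, 0.5]])
b = np.array([0.0, 0.1])
l_out = np.array([0.5, -np.inf])
u_out = np.array([np.inf, np.inf])
l_pre, u_pre = relu_preimage_box(l_out, u_out)
box = affine_preimage_as_box(W, b, l_pre, u_pre,
                             x_lo=np.array([-1.0, -1.0]),
                             x_hi=np.array([1.0, 1.0]))
print("Bounding box of the pre-image:", box)

Chaining such backward steps through all layers yields a compact overapproximation of the inputs that can reach a given output set (e.g., an unsafe region); an analogous inner bound gives an underapproximation.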

Published

2019-07-17

How to Cite

Dathathri, S., Gao, S., & Murray, R. M. (2019). Inverse Abstraction of Neural Networks Using Symbolic Interpolation. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 3437-3444. https://doi.org/10.1609/aaai.v33i01.33013437

Section

AAAI Technical Track: Machine Learning