A Theory of Independent Mechanisms for Extrapolation in Generative Models

Authors

  • Michel Besserve, Max Planck Institute for Intelligent Systems, Tübingen, Germany; Max Planck Institute for Biological Cybernetics, Tübingen, Germany
  • Remy Sun, ENS Rennes, France
  • Dominik Janzing, Max Planck Institute for Intelligent Systems, Tübingen, Germany
  • Bernhard Schölkopf, Max Planck Institute for Intelligent Systems, Tübingen, Germany

DOI:

https://doi.org/10.1609/aaai.v35i8.16833

Keywords:

Causal Learning, Unsupervised & Self-Supervised Learning

Abstract

Generative models can be trained to emulate complex empirical data, but are they useful for making predictions in previously unobserved environments? An intuitive way to promote such extrapolation capabilities is to have the architecture of such a model reflect a causal graph of the true data-generating process, so that one can intervene on each node independently of the others. However, the nodes of this graph are usually unobserved, leading to overparameterization and a lack of identifiability of the causal structure. We develop a theoretical framework that addresses this challenging situation by defining a weaker form of identifiability, based on the principle of independence of mechanisms. We demonstrate on toy examples that classical stochastic gradient descent can hinder the model's extrapolation capabilities, suggesting that independence of mechanisms should be enforced explicitly during training. Experiments on deep generative models trained on real-world data support these insights and illustrate how the extrapolation capabilities of such models can be leveraged.
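
As a minimal illustration of the architectural idea sketched in the abstract (a generator whose modules mirror the nodes of a causal graph, so that each mechanism can be intervened on independently), the toy sketch below may help. The module layout, the additive composition, and the intervene helper are assumptions made for illustration on this page; they are not the authors' architecture or training procedure.

    import torch
    import torch.nn as nn

    class ModularGenerator(nn.Module):
        """Toy generator whose architecture mirrors a hypothetical causal
        graph: each latent coordinate is mapped through its own mechanism
        module, so mechanisms can be swapped or intervened on independently."""
        def __init__(self, latent_dim=4, hidden=16, out_dim=8):
            super().__init__()
            # One small network ("mechanism") per latent cause.
            self.mechanisms = nn.ModuleList(
                nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))
                for _ in range(latent_dim)
            )

        def forward(self, z):
            # Additive composition of mechanism outputs, chosen only to keep
            # the sketch short; real architectures would compose differently.
            return sum(m(z[:, i:i + 1]) for i, m in enumerate(self.mechanisms))

    def intervene(model, index, new_mechanism):
        # Replace a single mechanism while leaving all others untouched,
        # the kind of node-wise intervention the abstract alludes to.
        model.mechanisms[index] = new_mechanism
        return model

    # Usage: generate, then intervene on mechanism 0 only.
    g = ModularGenerator()
    z = torch.randn(5, 4)
    x_before = g(z)
    g = intervene(g, 0, nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 8)))
    x_after = g(z)  # only mechanism 0's contribution changes

Under such a modular parameterization, the abstract's point is that plain stochastic gradient descent does not by itself keep the learned mechanisms independent, which is why explicitly enforcing independence during training is suggested.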

Published

2021-05-18

How to Cite

Besserve, M., Sun, R., Janzing, D., & Schölkopf, B. (2021). A Theory of Independent Mechanisms for Extrapolation in Generative Models. Proceedings of the AAAI Conference on Artificial Intelligence, 35(8), 6741-6749. https://doi.org/10.1609/aaai.v35i8.16833

Section

AAAI Technical Track on Machine Learning I