A Twist for Graph Classification: Optimizing Causal Information Flow in Graph Neural Networks

Authors

  • Zhe Zhao, University of Science and Technology of China; City University of Hong Kong
  • Pengkun Wang, University of Science and Technology of China
  • Haibin Wen, Shaoguan University
  • Yudong Zhang, University of Science and Technology of China
  • Zhengyang Zhou, University of Science and Technology of China
  • Yang Wang, University of Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v38i15.29648

Keywords:

ML: Transparent, Interpretable, Explainable ML, ML: Deep Learning Algorithms, ML: Deep Learning Theory, ML: Graph-based Machine Learning, ML: Information Theory, ML: Representation Learning, ML: Semi-Supervised Learning, ML: Transfer, Domain Adaptation, Multi-Task Learning

Abstract

Graph neural networks (GNNs) have achieved state-of-the-art results on many graph representation learning tasks by exploiting statistical correlations. However, numerous observations have shown that such correlations may not reflect the true causal mechanisms underlying the data and can therefore hamper a model's ability to generalize beyond the observed distribution. To address this problem, we propose an Information-based Causal Learning (ICL) framework that combines information theory and causality to analyze and improve graph representation learning, transforming informational relevance into causal dependence. Specifically, we first introduce a multi-objective mutual information optimization target, derived from information-theoretic analysis and causal learning principles, that simultaneously extracts invariant and interpretable causal information and reduces reliance on non-causal information in correlations. To optimize this target, we design a causal disentanglement layer that effectively decouples the causal and non-causal information in graph representations. Moreover, because mutual information estimation is intractable, we derive variational bounds that transform the above objective into a tractable loss function. To balance the multiple information objectives and avoid optimization conflicts, we leverage multi-objective gradient descent to achieve a stable and efficient transformation from informational correlation to causal dependency. Our approach provides important insights into modulating the information flow in GNNs to enhance their reliability and generalization. Extensive experiments demonstrate that our approach significantly improves the robustness and interpretability of GNNs under various distribution shifts, and visual analyses illustrate how it converts informational dependencies in representations into causal dependencies.
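The abstract does not specify how the multi-objective gradient descent step is carried out; a common instantiation for balancing two loss gradients without conflict is the closed-form two-task case of the min-norm (MGDA-style) combination. The sketch below, in NumPy, is an illustrative assumption rather than the authors' actual procedure: given gradients of two information objectives, it finds the convex combination with minimum norm, which suppresses directly conflicting update directions.

```python
import numpy as np

def min_norm_weight(g1, g2):
    """Closed-form weight alpha minimizing ||alpha*g1 + (1-alpha)*g2||^2.

    g1, g2: flattened gradient vectors of the two objectives
    (e.g. the causal-information and non-causal-suppression losses).
    Returns alpha in [0, 1]; the combined update is alpha*g1 + (1-alpha)*g2.
    """
    diff = g1 - g2
    denom = diff @ diff
    if denom == 0.0:          # identical gradients: any weight works
        return 0.5
    alpha = ((g2 - g1) @ g2) / denom
    return float(np.clip(alpha, 0.0, 1.0))

# Orthogonal objectives: both directions are kept, equally weighted.
g1, g2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
a = min_norm_weight(g1, g2)
combined = a * g1 + (1 - a) * g2   # -> [0.5, 0.5]

# Directly conflicting objectives: the combined direction vanishes,
# signalling that a plain weighted sum would thrash between them.
h1, h2 = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
b = min_norm_weight(h1, h2)
conflict = b * h1 + (1 - b) * h2   # -> [0.0, 0.0]
```

In a training loop, one would compute per-objective gradients of the shared encoder parameters, derive `alpha` as above, and apply the combined gradient; for more than two objectives, the weights come from a small quadratic program instead of this closed form.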

Published

2024-03-24

How to Cite

Zhao, Z., Wang, P., Wen, H., Zhang, Y., Zhou, Z., & Wang, Y. (2024). A Twist for Graph Classification: Optimizing Causal Information Flow in Graph Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 38(15), 17042-17050. https://doi.org/10.1609/aaai.v38i15.29648

Section

AAAI Technical Track on Machine Learning VI