A Twist for Graph Classification: Optimizing Causal Information Flow in Graph Neural Networks

Authors

  • Zhe Zhao, University of Science and Technology of China; City University of Hong Kong
  • Pengkun Wang, University of Science and Technology of China
  • Haibin Wen, Shaoguan University
  • Yudong Zhang, University of Science and Technology of China
  • Zhengyang Zhou, University of Science and Technology of China
  • Yang Wang, University of Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v38i15.29648

Keywords:

ML: Transparent, Interpretable, Explainable ML, ML: Deep Learning Algorithms, ML: Deep Learning Theory, ML: Graph-based Machine Learning, ML: Information Theory, ML: Representation Learning, ML: Semi-Supervised Learning, ML: Transfer, Domain Adaptation, Multi-Task Learning

Abstract

Graph neural networks (GNNs) have achieved state-of-the-art results on many graph representation learning tasks by exploiting statistical correlations. However, numerous observations have shown that such correlations may not reflect the true causal mechanisms underlying the data and can therefore hamper a model's ability to generalize beyond the observed distribution. To address this problem, we propose an Information-based Causal Learning (ICL) framework that combines information theory and causality to analyze and improve graph representation learning, transforming informational relevance into causal dependence. Specifically, we first introduce a multi-objective mutual information optimization target, derived from information-theoretic analysis and causal learning principles, that simultaneously extracts invariant and interpretable causal information and reduces reliance on non-causal information in correlations. To optimize this target, we design a causal disentanglement layer that effectively decouples the causal and non-causal information in graph representations. Moreover, because mutual information estimation is intractable, we derive variational bounds that transform the above objective into a tractable loss function. To balance the multiple information objectives and avoid optimization conflicts, we leverage multi-objective gradient descent to achieve a stable and efficient transformation from informational correlation to causal dependency. Our approach provides important insights into modulating the information flow in GNNs to enhance their reliability and generalization. Extensive experiments demonstrate that our approach significantly improves the robustness and interpretability of GNNs under various distribution shifts, and visual analyses illustrate how it converts informational dependencies in representations into causal dependencies.
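The abstract does not specify how the multi-objective gradient descent step is carried out; a common instantiation for balancing two loss gradients without conflict is the closed-form two-task case of the min-norm (MGDA-style) combination. The sketch below, in NumPy, is an illustrative assumption rather than the authors' actual procedure: given gradients of two information objectives, it finds the convex combination with minimum norm, which suppresses directly conflicting update directions.

```python
import numpy as np

def min_norm_weight(g1, g2):
    """Closed-form weight alpha minimizing ||alpha*g1 + (1-alpha)*g2||^2.

    g1, g2: flattened gradient vectors of the two objectives
    (e.g. the causal-information and non-causal-suppression losses).
    Returns alpha in [0, 1]; the combined update is alpha*g1 + (1-alpha)*g2.
    """
    diff = g1 - g2
    denom = diff @ diff
    if denom == 0.0:          # identical gradients: any weight works
        return 0.5
    alpha = ((g2 - g1) @ g2) / denom
    return float(np.clip(alpha, 0.0, 1.0))

# Orthogonal objectives: both directions are kept, equally weighted.
g1, g2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
a = min_norm_weight(g1, g2)
combined = a * g1 + (1 - a) * g2   # -> [0.5, 0.5]

# Directly conflicting objectives: the combined direction vanishes,
# signalling that a plain weighted sum would thrash between them.
h1, h2 = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
b = min_norm_weight(h1, h2)
conflict = b * h1 + (1 - b) * h2   # -> [0.0, 0.0]
```

In a training loop, one would compute per-objective gradients of the shared encoder parameters, derive `alpha` as above, and apply the combined gradient; for more than two objectives, the weights come from a small quadratic program instead of this closed form.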

Published

2024-03-24

How to Cite

Zhao, Z., Wang, P., Wen, H., Zhang, Y., Zhou, Z., & Wang, Y. (2024). A Twist for Graph Classification: Optimizing Causal Information Flow in Graph Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 38(15), 17042-17050. https://doi.org/10.1609/aaai.v38i15.29648

Section

AAAI Technical Track on Machine Learning VI