A Twist for Graph Classification: Optimizing Causal Information Flow in Graph Neural Networks
DOI:
https://doi.org/10.1609/aaai.v38i15.29648
Keywords:
ML: Transparent, Interpretable, Explainable ML, ML: Deep Learning Algorithms, ML: Deep Learning Theory, ML: Graph-based Machine Learning, ML: Information Theory, ML: Representation Learning, ML: Semi-Supervised Learning, ML: Transfer, Domain Adaptation, Multi-Task Learning
Abstract
Graph neural networks (GNNs) have achieved state-of-the-art results on many graph representation learning tasks by exploiting statistical correlations. However, numerous observations have shown that such correlations may not reflect the true causal mechanisms underlying the data and can therefore hamper a model's ability to generalize beyond the observed distribution. To address this problem, we propose an Information-based Causal Learning (ICL) framework that combines information theory with causality to analyze and improve graph representation learning, transforming informational relevance into causal dependence. Specifically, we first introduce a multi-objective mutual information criterion, derived from information-theoretic analysis and causal learning principles, that simultaneously extracts invariant, interpretable causal information and reduces reliance on non-causal correlational information. To optimize these objectives, we design a causal disentanglement layer that effectively decouples the causal and non-causal information in the graph representations. Moreover, because mutual information is intractable to estimate exactly, we derive variational bounds that turn the above criterion into a tractable loss function. To balance the multiple information objectives and avoid optimization conflicts, we employ multi-objective gradient descent, achieving a stable and efficient transformation from informational correlation to causal dependency. Our approach offers insight into modulating the information flow in GNNs to enhance their reliability and generalization. Extensive experiments demonstrate that it significantly improves the robustness and interpretability of GNNs under different distribution shifts, and visual analyses illustrate how it converts informational dependencies in representations into causal ones.
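The paper's own variational bounds are not reproduced on this page, but the abstract's idea of replacing an intractable mutual information term with a trainable bound can be illustrated with a standard Donsker-Varadhan (MINE-style) estimator. The sketch below is a hypothetical illustration, not the authors' implementation: the critic architecture, the names `MINECritic` and `mi_lower_bound`, and the batch-shuffling trick for marginal samples are all assumptions.

```python
import math

import torch
import torch.nn as nn


class MINECritic(nn.Module):
    """Scalar critic T(x, z) for a Donsker-Varadhan lower bound on I(X; Z)."""

    def __init__(self, x_dim: int, z_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x, z], dim=-1)).squeeze(-1)


def mi_lower_bound(critic: MINECritic, x: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """Batch estimate of I(X; Z) >= E_joint[T] - log E_marginal[exp(T)].

    Joint samples are the paired rows (x_i, z_i); marginal samples are
    approximated by shuffling z within the batch.
    """
    joint = critic(x, z).mean()
    z_marginal = z[torch.randperm(z.size(0))]
    log_mean_exp = torch.logsumexp(critic(x, z_marginal), dim=0) - math.log(z.size(0))
    return joint - log_mean_exp
```

In a setup of this kind, maximizing the bound with respect to the critic and the encoder would encourage a causal representation to retain label-relevant information, while minimizing an analogous estimate for a non-causal branch would suppress it.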
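For balancing the resulting objectives, the abstract mentions multi-objective gradient descent. One common instantiation is the two-task min-norm solver of MGDA (Sener & Koltun, 2018), which admits a closed form; the sketch below is an assumed stand-in for whatever scheme the paper actually uses, and `loss_a`/`loss_b` are hypothetical placeholders for the competing information losses.

```python
import torch


def flat_grad(loss: torch.Tensor, params) -> torch.Tensor:
    # Flatten d(loss)/d(params) over the shared parameters into one vector.
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])


def mgda_two_task_direction(loss_a: torch.Tensor, loss_b: torch.Tensor, params) -> torch.Tensor:
    """Two-task min-norm gradient combination (MGDA-style).

    Solves min_{alpha in [0,1]} ||alpha*g_a + (1-alpha)*g_b||^2 in closed form:
        alpha* = clip(((g_b - g_a) . g_b) / ||g_a - g_b||^2, 0, 1).
    The result is a common descent direction for both losses whenever one
    exists, which is one way to avoid conflicts between objectives.
    """
    g_a, g_b = flat_grad(loss_a, params), flat_grad(loss_b, params)
    diff = g_a - g_b
    alpha = ((g_b - g_a).dot(g_b) / diff.dot(diff).clamp_min(1e-12)).clamp(0.0, 1.0)
    return alpha * g_a + (1.0 - alpha) * g_b
```

The returned vector can be unflattened back onto the shared parameters and applied as a single update, in place of separately weighted gradient steps.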
Published
2024-03-24
How to Cite
Zhao, Z., Wang, P., Wen, H., Zhang, Y., Zhou, Z., & Wang, Y. (2024). A Twist for Graph Classification: Optimizing Causal Information Flow in Graph Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 38(15), 17042-17050. https://doi.org/10.1609/aaai.v38i15.29648
Section
AAAI Technical Track on Machine Learning VI