Heterogeneous Graph Masked Autoencoders

Authors

  • Yijun Tian (Department of Computer Science and Engineering, University of Notre Dame; Lucy Family Institute for Data and Society, University of Notre Dame)
  • Kaiwen Dong (Department of Computer Science and Engineering, University of Notre Dame; Lucy Family Institute for Data and Society, University of Notre Dame)
  • Chunhui Zhang (Department of Computer Science, Brandeis University)
  • Chuxu Zhang (Department of Computer Science, Brandeis University)
  • Nitesh V. Chawla (Department of Computer Science and Engineering, University of Notre Dame; Lucy Family Institute for Data and Society, University of Notre Dame)

DOI:

https://doi.org/10.1609/aaai.v37i8.26192

Keywords:

ML: Graph-based Machine Learning, DMKM: Graph Mining, Social Network Analysis & Community Mining, ML: Unsupervised & Self-Supervised Learning, ML: Deep Generative Models & Autoencoders

Abstract

Generative self-supervised learning (SSL), especially masked autoencoders, has become one of the most exciting learning paradigms and has shown great potential in handling graph data. However, real-world graphs are always heterogeneous, which poses three critical challenges that existing methods ignore: 1) how to capture complex graph structure, 2) how to incorporate various node attributes, and 3) how to encode different node positions. In light of this, we study the problem of generative SSL on heterogeneous graphs and propose HGMAE, a novel heterogeneous graph masked autoencoder model that addresses these challenges. HGMAE captures comprehensive graph information via two innovative masking techniques and three unique training strategies. In particular, we first develop metapath masking and adaptive attribute masking with a dynamic mask rate to enable effective and stable learning on heterogeneous graphs. We then design several training strategies: metapath-based edge reconstruction to capture complex structural information, target attribute restoration to incorporate various node attributes, and positional feature prediction to encode node positional information. Extensive experiments demonstrate that HGMAE outperforms both contrastive and generative state-of-the-art baselines on several tasks across multiple datasets. Code is available at https://github.com/meettyj/HGMAE.
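
As a rough illustration of the idea of attribute masking with a dynamic mask rate mentioned in the abstract, the following minimal Python sketch masks node attribute vectors at a rate that changes over training. It is not the authors' implementation: the function names `dynamic_mask_rate` and `mask_attributes`, the linear schedule, the start and end rates (0.5 and 0.8), and the use of an all-zero vector as the mask value are all assumptions made here for illustration. The linked repository is the reference for the actual method.

```python
import random


def dynamic_mask_rate(epoch, total_epochs, start=0.5, end=0.8):
    """Return a mask rate that moves linearly from `start` to `end`.

    The linear schedule and the default rates are assumptions for this sketch.
    """
    if total_epochs <= 1:
        return end
    frac = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start + frac * (end - start)


def mask_attributes(features, rate, rng):
    """Zero out the attribute vectors of a random subset of nodes.

    Returns a masked copy of `features` and the sorted indices of the
    masked nodes. Using zeros as the mask value is an assumption.
    """
    num_nodes = len(features)
    num_masked = int(round(rate * num_nodes))
    masked_idx = sorted(rng.sample(range(num_nodes), num_masked))
    masked_set = set(masked_idx)
    masked = [
        [0.0] * len(row) if i in masked_set else list(row)
        for i, row in enumerate(features)
    ]
    return masked, masked_idx


# Toy data: 10 nodes with 2 attributes each (hypothetical values).
rng = random.Random(0)
features = [[float(i), float(i) + 0.5] for i in range(1, 11)]

for epoch in (0, 5, 9):
    rate = dynamic_mask_rate(epoch, total_epochs=10)
    masked, idx = mask_attributes(features, rate, rng)
    print(f"epoch={epoch} rate={rate:.2f} masked_nodes={len(idx)}")
```

In a masked-autoencoder setup, the returned indices would identify the nodes whose original attributes the decoder is asked to restore; the encoder and decoder themselves are outside the scope of this sketch.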

Published

2023-06-26

How to Cite

Tian, Y., Dong, K., Zhang, C., Zhang, C., & Chawla, N. V. (2023). Heterogeneous Graph Masked Autoencoders. Proceedings of the AAAI Conference on Artificial Intelligence, 37(8), 9997-10005. https://doi.org/10.1609/aaai.v37i8.26192

Section

AAAI Technical Track on Machine Learning III