Multi-Scale Distillation from Multiple Graph Neural Networks

Authors

  • Chunhai Zhang, College of Artificial Intelligence, Nankai University, Tianjin, China
  • Jie Liu, College of Artificial Intelligence, Nankai University, Tianjin, China; Cloopen AI Research, Beijing, China
  • Kai Dang, College of Artificial Intelligence, Nankai University, Tianjin, China
  • Wenzheng Zhang, College of Artificial Intelligence, Nankai University, Tianjin, China

DOI:

https://doi.org/10.1609/aaai.v36i4.20354

Keywords:

Data Mining & Knowledge Management (DMKM), Knowledge Representation And Reasoning (KRR), Machine Learning (ML)

Abstract

Knowledge Distillation (KD), an effective model compression and acceleration technique, has recently been applied to graph neural networks (GNNs) with success. Existing approaches utilize a single GNN model as the teacher to distill knowledge. However, we observe that GNN models with different numbers of layers demonstrate different classification abilities on nodes with different degrees. On the one hand, for nodes with high degrees, whose local structures are dense and complex, more message passing is needed, so GNN models with more layers perform better. On the other hand, for nodes with low degrees, whose local structures are relatively sparse and simple, repeated message passing can easily lead to over-smoothing, so GNN models with fewer layers are more suitable. Consequently, existing knowledge distillation approaches that rely on a single teacher GNN are sub-optimal. To this end, we propose a novel approach that distills multi-scale knowledge: the student learns from multiple teacher GNNs with different numbers of layers, which capture the topological semantics at different scales. Instead of learning from the teacher models equally, the proposed method automatically assigns appropriate weights to each teacher model via an attention mechanism, enabling the student to select suitable teachers for different local structures. Extensive experiments on four public datasets demonstrate the superiority of the proposed method over state-of-the-art methods. Our code is publicly available at https://github.com/NKU-IIPLab/MSKD.
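The core idea described in the abstract, distilling from several teacher GNNs of different depths with per-node attention over the teachers, can be illustrated with a minimal PyTorch sketch. This is an assumption-laden illustration and not the authors' implementation (see the linked repository for the official code): the attention form, the temperature, and all module and variable names below are hypothetical.

```python
# Minimal sketch of attention-weighted multi-teacher distillation for node
# classification. Assumes per-node logits from K teacher GNNs of different
# depths (e.g., 1-, 2-, and 3-layer models) and a student hidden representation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiTeacherDistillLoss(nn.Module):
    def __init__(self, student_dim: int, num_teachers: int, temperature: float = 2.0):
        super().__init__()
        # Per-node attention scores over teachers, computed from the student's
        # hidden representation, so each node can favor shallower or deeper teachers.
        self.attn = nn.Linear(student_dim, num_teachers)
        self.temperature = temperature

    def forward(self, student_hidden, student_logits, teacher_logits_list):
        # student_hidden:      [num_nodes, student_dim]
        # student_logits:      [num_nodes, num_classes]
        # teacher_logits_list: list of [num_nodes, num_classes] tensors, one per teacher
        T = self.temperature
        alpha = torch.softmax(self.attn(student_hidden), dim=-1)          # [N, K]
        teacher_probs = torch.stack(
            [F.softmax(t / T, dim=-1) for t in teacher_logits_list], dim=1
        )                                                                  # [N, K, C]
        # Attention-weighted mixture of teacher soft labels per node.
        mixed = (alpha.unsqueeze(-1) * teacher_probs).sum(dim=1)           # [N, C]
        log_student = F.log_softmax(student_logits / T, dim=-1)
        # KL divergence between the student's softened prediction and the mixture.
        return F.kl_div(log_student, mixed, reduction="batchmean") * (T * T)


if __name__ == "__main__":
    num_nodes, hidden, classes, num_teachers = 8, 16, 5, 3
    loss_fn = MultiTeacherDistillLoss(hidden, num_teachers)
    h = torch.randn(num_nodes, hidden)
    s_logits = torch.randn(num_nodes, classes)
    t_logits = [torch.randn(num_nodes, classes) for _ in range(num_teachers)]
    print(loss_fn(h, s_logits, t_logits).item())
```

In training, this distillation term would typically be combined with the usual cross-entropy loss on labeled nodes; the attention weights let high-degree nodes lean on deeper teachers and low-degree nodes on shallower ones, which is the multi-scale behavior the abstract motivates.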

Published

2022-06-28

How to Cite

Zhang, C., Liu, J., Dang, K., & Zhang, W. (2022). Multi-Scale Distillation from Multiple Graph Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 36(4), 4337-4344. https://doi.org/10.1609/aaai.v36i4.20354

Section

AAAI Technical Track on Data Mining and Knowledge Management