Boosting Graph Neural Networks via Adaptive Knowledge Distillation
DOI:
https://doi.org/10.1609/aaai.v37i6.25944
Keywords:
ML: Graph-based Machine Learning, DMKM: Graph Mining, Social Network Analysis & Community Mining
Abstract
Graph neural networks (GNNs) have shown remarkable performance on diverse graph mining tasks. Although they share the same message passing framework, our study shows that different GNNs learn distinct knowledge from the same graph. This implies that performance can potentially be improved by distilling the complementary knowledge from multiple models. However, knowledge distillation (KD) typically transfers knowledge from high-capacity teachers to a lightweight student, which deviates from our scenario because GNNs are often shallow. To transfer knowledge effectively, we need to tackle two challenges: how to transfer knowledge from compact teachers to a student with the same capacity, and how to exploit the student GNN's own learning ability. In this paper, we propose a novel adaptive KD framework, called BGNN, which sequentially transfers knowledge from multiple GNNs into a student GNN. We also introduce an adaptive temperature module and a weight boosting module. These modules guide the student to the appropriate knowledge for effective learning. Extensive experiments have demonstrated the effectiveness of BGNN. In particular, we achieve up to 3.05% improvement for node classification and 6.35% improvement for graph classification over vanilla GNNs.
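As a rough illustration of the temperature-scaled distillation idea mentioned in the abstract, the sketch below computes a per-node KD loss in which each node has its own softmax temperature. It is a generic sketch under stated assumptions, not the BGNN implementation: the function names (`softmax`, `kd_loss`, `mean_kd_loss`) are hypothetical, the per-node temperatures are passed in by hand rather than produced by a learned adaptive temperature module, and the weight boosting module is not modeled.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature):
    """KL(teacher || student) between temperature-softened distributions for one node."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mean_kd_loss(teacher_logits_per_node, student_logits_per_node, temperatures):
    """Average KD loss over nodes, each node using its own temperature.

    Assumption: `temperatures` stands in for the output of an adaptive
    temperature module; here the values are simply supplied by the caller.
    """
    losses = [
        kd_loss(t, s, tau)
        for t, s, tau in zip(teacher_logits_per_node, student_logits_per_node, temperatures)
    ]
    return sum(losses) / len(losses)


# Toy example: two nodes, three classes, hand-picked temperatures.
teacher = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
student = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
loss = mean_kd_loss(teacher, student, temperatures=[1.0, 4.0])
```

A higher temperature flattens both distributions, so the student is matched against softer teacher targets for that node; a lower temperature emphasizes the teacher's top prediction.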
Published
2023-06-26
How to Cite
Guo, Z., Zhang, C., Fan, Y., Tian, Y., Zhang, C., & Chawla, N. V. (2023). Boosting Graph Neural Networks via Adaptive Knowledge Distillation. Proceedings of the AAAI Conference on Artificial Intelligence, 37(6), 7793-7801. https://doi.org/10.1609/aaai.v37i6.25944
Issue
Section
AAAI Technical Track on Machine Learning I