Harnessing Language Model for Cross-Heterogeneity Graph Knowledge Transfer

Jinyu Yang; Ruijia Wang; Cheng Yang; Bo Yan; Qimin Zhou; Yang Juan; Chuan Shi

doi:10.1609/aaai.v39i12.33421

Authors

Jinyu Yang, Beijing University of Posts and Telecommunications
Ruijia Wang, China Telecom Cloud Computing Research Institute
Cheng Yang, Beijing University of Posts and Telecommunications
Bo Yan, Beijing University of Posts and Telecommunications
Qimin Zhou, Beijing University of Posts and Telecommunications
Yang Juan, Beijing University of Posts and Telecommunications
Chuan Shi, Beijing University of Posts and Telecommunications

DOI: https://doi.org/10.1609/aaai.v39i12.33421

Abstract

Heterogeneous graphs (HGs), which contain various node and edge types, are ubiquitous in real-world scenarios. Because labels in HGs are commonly sparse, some researchers propose to pretrain on source HGs to extract general knowledge and then fine-tune on a target HG for knowledge transfer. However, existing methods often assume that source and target HGs share a single heterogeneity, meaning that they have the same types of nodes and edges, which contradicts real-world scenarios that require cross-heterogeneity transfer. Although a recent study has made some preliminary attempts at cross-heterogeneity learning, its definition of general knowledge relies heavily on human knowledge, which lacks flexibility and leads to suboptimal transfer. To address this problem, we propose a novel Language Model-enhanced Cross-Heterogeneity learning model, namely LMCH. Specifically, we first design a metapath-based corpus construction method to unify HG representations as languages. The corpora of source HGs are then used to fine-tune a pretrained Language Model (LM), enabling the LM to autonomously extract general knowledge across different HGs. Furthermore, to fully utilize the extensive unlabeled nodes in a few-labeled target HG, we propose an iterative training pipeline with the help of an extra Graph Neural Network (GNN) predictor, enhanced by LM-GNN contrastive alignment at the end of each iteration. Extensive experiments on four real-world datasets demonstrate the superior performance of LMCH over state-of-the-art methods.
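
As a rough illustration of what turning a heterogeneous graph into text along a metapath might look like, the following minimal Python sketch enumerates metapath instances on a toy graph and renders each one as a sentence. It is not the corpus construction method of LMCH, whose details the abstract does not give. The toy graph, the sentence template, and the names `build_adjacency`, `metapath_instances`, and `to_sentence` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy heterogeneous graph: node id -> (node type, text attribute).
nodes = {
    "a1": ("author", "Alice"),
    "p1": ("paper", "Graph Pretraining"),
    "p2": ("paper", "Language Models for Graphs"),
    "v1": ("venue", "AAAI"),
}
# Edges are treated as undirected in this sketch.
edges = [("a1", "p1"), ("a1", "p2"), ("p1", "v1"), ("p2", "v1")]


def build_adjacency(edges):
    """Build an undirected adjacency list from an edge list."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def metapath_instances(nodes, adj, metapath):
    """Enumerate node sequences whose types match `metapath` in order.

    A node is not revisited within one instance (an assumption of this sketch).
    """
    paths = [[n] for n, (t, _) in sorted(nodes.items()) if t == metapath[0]]
    for wanted in metapath[1:]:
        paths = [
            p + [nbr]
            for p in paths
            for nbr in sorted(adj[p[-1]])
            if nodes[nbr][0] == wanted and nbr not in p
        ]
    return paths


def to_sentence(nodes, path):
    """Render one metapath instance with an assumed, simple text template."""
    return " -> ".join(f"{nodes[n][0]} '{nodes[n][1]}'" for n in path)


adj = build_adjacency(edges)
corpus = [
    to_sentence(nodes, p)
    for p in metapath_instances(nodes, adj, ["author", "paper", "venue"])
]
for sentence in corpus:
    print(sentence)
```

Because every graph is reduced to plain text, graphs with different node and edge types end up in the same input space, which is what would let a single language model be fine-tuned on corpora from several source graphs.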
