Harnessing Language Model for Cross-Heterogeneity Graph Knowledge Transfer

Authors

  • Jinyu Yang Beijing University of Posts and Telecommunications
  • Ruijia Wang China Telecom Cloud Computing Research Institute
  • Cheng Yang Beijing University of Posts and Telecommunications
  • Bo Yan Beijing University of Posts and Telecommunications
  • Qimin Zhou Beijing University of Posts and Telecommunications
  • Yang Juan Beijing University of Posts and Telecommunications
  • Chuan Shi Beijing University of Posts and Telecommunications

DOI:

https://doi.org/10.1609/aaai.v39i12.33421

Abstract

Heterogeneous graphs (HGs), which contain multiple node and edge types, are ubiquitous in real-world scenarios. Because labels in HGs are commonly sparse, some researchers propose pretraining on source HGs to extract general knowledge and then fine-tuning on a target HG for knowledge transfer. However, existing methods often assume that source and target HGs share a single heterogeneity, meaning that they have the same types of nodes and edges, which contradicts real-world scenarios that require cross-heterogeneity transfer. Although a recent study has made preliminary attempts at cross-heterogeneity learning, its definition of general knowledge relies heavily on human knowledge, which lacks flexibility and leads to suboptimal transfer. To address this problem, we propose a novel Language Model-enhanced Cross-Heterogeneity learning model, namely LMCH. Specifically, we first design a metapath-based corpus construction method to unify HG representations as language. The corpora of source HGs are then used to fine-tune a pretrained Language Model (LM), enabling the LM to autonomously extract general knowledge across different HGs. Furthermore, to fully utilize the many unlabeled nodes in a few-labeled target HG, we propose an iterative training pipeline with the help of an extra Graph Neural Network (GNN) predictor, enhanced by LM-GNN contrastive alignment at the end of each iteration. Extensive experiments on four real-world datasets demonstrate the superior performance of LMCH over state-of-the-art methods.
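The abstract does not specify the corpus format, so the following is only a minimal sketch of what a metapath-based corpus construction step might look like: it enumerates the instances of one metapath from a start node and verbalizes each instance as a sentence. The toy graph, the relation phrases, and the function names (`build_adjacency`, `metapath_instances`, `verbalize`) are all hypothetical and are not taken from LMCH.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Index typed edges as adjacency[(src, relation)] -> list of dst nodes."""
    adjacency = defaultdict(list)
    for src, relation, dst in edges:
        adjacency[(src, relation)].append(dst)
    return adjacency


def metapath_instances(adjacency, start, relations):
    """Enumerate every node path from `start` that follows `relations` in order."""
    paths = [[start]]
    for relation in relations:
        paths = [
            path + [nxt]
            for path in paths
            for nxt in adjacency.get((path[-1], relation), [])
        ]
    return paths


def verbalize(path, relations, node_text):
    """Turn one metapath instance into a plain-text sentence (hypothetical template)."""
    parts = [node_text[path[0]]]
    for relation, node in zip(relations, path[1:]):
        parts.append(f"{relation} {node_text[node]}")
    return " ".join(parts) + "."


# Toy heterogeneous graph (assumed data): authors, papers, and a venue.
edges = [
    ("a1", "writes", "p1"),
    ("a1", "writes", "p2"),
    ("p1", "which is published in", "v1"),
    ("p2", "which is published in", "v1"),
]
node_text = {
    "a1": "author Alice",
    "p1": "paper Graph Transfer",
    "p2": "paper Text Mining",
    "v1": "venue AAAI",
}

# Author -> Paper -> Venue metapath, expressed as a sequence of relations.
relations = ["writes", "which is published in"]
adjacency = build_adjacency(edges)
corpus = [
    verbalize(path, relations, node_text)
    for path in metapath_instances(adjacency, "a1", relations)
]
```

Because every graph is reduced to plain sentences, source and target HGs with different node and edge types end up in the same textual form, which is the property the abstract relies on when fine-tuning a single LM across heterogeneities.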

Published

2025-04-11

How to Cite

Yang, J., Wang, R., Yang, C., Yan, B., Zhou, Q., Juan, Y., & Shi, C. (2025). Harnessing Language Model for Cross-Heterogeneity Graph Knowledge Transfer. Proceedings of the AAAI Conference on Artificial Intelligence, 39(12), 13026–13034. https://doi.org/10.1609/aaai.v39i12.33421

Section

AAAI Technical Track on Data Mining & Knowledge Management II