HFR-MKGC: Hierarchical Fusion Reasoning with MLLMs for Multi-modal Knowledge Graph Completion

Authors

  • Di Wang, Beijing University of Posts and Telecommunications; Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia
  • Junping Du, Beijing University of Posts and Telecommunications; Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia
  • Zhe Xue, Beijing University of Posts and Telecommunications; Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia
  • Meiyu Liang, Beijing University of Posts and Telecommunications; Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia
  • Guanhua Ye, Beijing University of Posts and Telecommunications; Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia
  • Yingxia Shao, Beijing University of Posts and Telecommunications; Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia
  • Haisheng Li, Beijing Technology and Business University

DOI:

https://doi.org/10.1609/aaai.v40i18.38613

Abstract

Multi-modal knowledge graph completion (MMKGC) aims to infer the missing entities of triples by leveraging heterogeneous information in a knowledge graph (KG). However, existing approaches often struggle with inconsistent modality alignment, limited reasoning depth, and insufficient negative sample quality. In this work, we propose HFR-MKGC, a novel framework that integrates hierarchical modal fusion and Multimodal Large Language Model (MLLM) reasoning for robust and expressive MMKGC. Specifically, we introduce a relation-guided hierarchical modal fusion module, which conducts fine-grained intra-visual fusion and relation-guided cross-modal integration to yield rich entity representations. HFR-MKGC employs a fine-tuned MLLM to perform instruction-based triple reasoning, producing candidate entities for completion. It then constructs hard negative samples through textual perturbation by the MLLM and visual feature augmentation with rotation and noise, and optimizes the model via adversarial training. Extensive experiments on three MMKGC benchmarks demonstrate that our method outperforms state-of-the-art methods, validating its effectiveness in MMKGC.
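
The visual side of the hard-negative construction mentioned in the abstract (augmentation "with rotation and noise") might be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say whether rotation is applied to raw images or to feature vectors, so the code assumes a small rotation in one coordinate plane of an embedding vector followed by Gaussian noise. The function names `rotate_plane`, `hard_negative`, and `cosine`, and the parameters `max_angle` and `noise_std`, are hypothetical.

```python
import math
import random


def rotate_plane(vec, i, j, angle):
    """Rotate `vec` by `angle` radians in the (i, j) coordinate plane (a Givens rotation)."""
    out = list(vec)
    c, s = math.cos(angle), math.sin(angle)
    out[i] = c * vec[i] - s * vec[j]
    out[j] = s * vec[i] + c * vec[j]
    return out


def hard_negative(vec, max_angle=0.3, noise_std=0.05, rng=None):
    """Perturb a visual feature vector slightly so that it stays close to the original.

    Assumption: a "hard" negative is one that remains similar to the positive,
    so both the rotation angle and the noise scale are kept small.
    """
    if rng is None:
        rng = random.Random()
    i, j = rng.sample(range(len(vec)), 2)       # pick two distinct coordinates
    angle = rng.uniform(-max_angle, max_angle)  # small random rotation angle
    rotated = rotate_plane(vec, i, j, angle)
    return [x + rng.gauss(0.0, noise_std) for x in rotated]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    rng = random.Random(0)
    feature = [rng.gauss(0.0, 1.0) for _ in range(8)]  # stand-in visual embedding
    negative = hard_negative(feature, rng=rng)
    print("cosine similarity to original:", cosine(feature, negative))
```

Keeping `max_angle` and `noise_std` small leaves the perturbed vector near the original in embedding space, which is one plausible reading of why such samples would be hard for a model to tell apart from positives.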

Published

2026-03-14

How to Cite

Wang, D., Du, J., Xue, Z., Liang, M., Ye, G., Shao, Y., & Li, H. (2026). HFR-MKGC: Hierarchical Fusion Reasoning with MLLMs for Multi-modal Knowledge Graph Completion. Proceedings of the AAAI Conference on Artificial Intelligence, 40(18), 15815–15823. https://doi.org/10.1609/aaai.v40i18.38613

Section

AAAI Technical Track on Data Mining & Knowledge Management II