Beyond Entities: A Large-Scale Multi-Modal Knowledge Graph with Triplet Fact Grounding

Authors

  • Jingping Liu, East China University of Science and Technology
  • Mingchuan Zhang, Fudan University
  • Weichen Li, Fudan University
  • Chao Wang, Shanghai University
  • Shuang Li, Fudan University
  • Haiyun Jiang, Tencent AI Lab
  • Sihang Jiang, Fudan University
  • Yanghua Xiao, Fudan University
  • Yunwen Chen, DataGrand Inc.

DOI:

https://doi.org/10.1609/aaai.v38i17.29828

Keywords:

NLP: Language Grounding & Multi-modal NLP, KRR: Knowledge Acquisition

Abstract

Much effort has been devoted to building multi-modal knowledge graphs by visualizing entities on images, but the multi-modal information of the relations between entities has been ignored. Hence, in this paper, we aim to construct a new large-scale multi-modal knowledge graph with triplet facts grounded on images that reflect not only entities but also their relations. To this end, we propose a novel pipeline method consisting of triplet fact filtering, image retrieving, entity-based image filtering, relation-based image filtering, and image clustering. In this way, a multi-modal knowledge graph named ImgFact is constructed, which contains 247,732 triplet facts and 3,730,805 images. In experiments, manual and automatic evaluations confirm the reliable quality of ImgFact. We further use the obtained images to enhance model performance on two tasks. In particular, the model optimized with ImgFact achieves improvements of 8.38% and 9.87% in F1 on relation classification over the solutions enhanced by an existing multi-modal knowledge graph and by VisualChatGPT, respectively. We release ImgFact and its instructions at https://github.com/kleinercubs/ImgFact.
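
As an illustration only, the five stages named in the abstract can be pictured as functions applied in sequence to a list of triplet facts, each fact carrying the images grounded on it. The sketch below is a minimal Python rendering under stated assumptions: the `GroundedTriplet` schema, the `run_pipeline` helper, and the toy `drop_ungrounded` stage are hypothetical names introduced here and are not taken from the ImgFact code or data format. Only the stage names and the two dataset counts come from the abstract.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Stage names in the order the abstract lists them.
PIPELINE_STAGES = [
    "triplet fact filtering",
    "image retrieving",
    "entity-based image filtering",
    "relation-based image filtering",
    "image clustering",
]


@dataclass
class GroundedTriplet:
    """A (head, relation, tail) fact plus its images (hypothetical schema)."""
    head: str
    relation: str
    tail: str
    images: List[str] = field(default_factory=list)


# A stage maps a list of grounded triplets to a (possibly smaller) list.
Stage = Callable[[List[GroundedTriplet]], List[GroundedTriplet]]


def run_pipeline(facts: List[GroundedTriplet], stages: List[Stage]) -> List[GroundedTriplet]:
    """Apply each stage in order, feeding its output to the next one."""
    for stage in stages:
        facts = stage(facts)
    return facts


def drop_ungrounded(facts: List[GroundedTriplet]) -> List[GroundedTriplet]:
    """Toy stand-in for a filtering stage: keep only facts that have images."""
    return [f for f in facts if f.images]


# Dataset statistics reported in the abstract.
NUM_TRIPLETS = 247_732
NUM_IMAGES = 3_730_805
# Roughly 15 images per triplet fact on average.
AVG_IMAGES_PER_TRIPLET = NUM_IMAGES / NUM_TRIPLETS
```

A real stage would replace `drop_ungrounded` with the corresponding filtering, retrieval, or clustering logic; the abstract does not specify how each stage is implemented, so none is assumed here.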

Published

2024-03-24

How to Cite

Liu, J., Zhang, M., Li, W., Wang, C., Li, S., Jiang, H., Jiang, S., Xiao, Y., & Chen, Y. (2024). Beyond Entities: A Large-Scale Multi-Modal Knowledge Graph with Triplet Fact Grounding. Proceedings of the AAAI Conference on Artificial Intelligence, 38(17), 18653-18661. https://doi.org/10.1609/aaai.v38i17.29828

Section

AAAI Technical Track on Natural Language Processing II