SciMKG: A Multimodal Knowledge Graph for Science Education with Text, Image, Video and Audio

Authors

  • Tong Lu School of Artificial Intelligence, Beijing Normal University
  • Zhichun Wang School of Artificial Intelligence, Beijing Normal University; Engineering Research Center of Intelligent Technology and Educational Application, Ministry of Education
  • Yaoyu Zhou School of Artificial Intelligence, Beijing Normal University
  • Yiming Guan School of Artificial Intelligence, Beijing Normal University
  • Zhiyong Bai Faculty of Education, Beijing Normal University
  • Junsheng Du School of Intelligent Systems Engineering, Sun Yat-sen University

DOI:

https://doi.org/10.1609/aaai.v40i18.38574

Abstract

Knowledge graphs (KGs) play a vital role in intelligent education by offering structured representations of educational content. However, constructing multimodal educational knowledge graphs (EKGs) from diverse open educational resources remains challenging due to the reliance on costly manual annotation and the lack of multimodal integration. In this work, we propose an automated framework that harnesses the reasoning capabilities of large language models (LLMs) to efficiently construct multimodal EKGs from open courses. In our framework, an Extraction-Verification-Integration-Augmentation pipeline incrementally extracts and refines disciplinary concepts from learning resources. Text, images, video, and audio are aligned with their corresponding concepts. To ensure semantic consistency across modalities, we propose a cross-modal alignment method based on shared structural and semantic features. Using our framework, we build SciMKG, a large-scale multimodal EKG for Chinese K-12 science education (biology, physics, and chemistry), encompassing 1,356 knowledge points, 34,630 multimodal concepts, and 403,400 relational triples. Experimental results show that our method improves concept extraction F1 score by 9% over state-of-the-art baselines; both automatic and human evaluations confirm the robustness of our multimodal alignment method. SciMKG and our construction toolkit will be publicly released to support further research and applications in AI-driven education.

Published

2026-03-14

How to Cite

Lu, T., Wang, Z., Zhou, Y., Guan, Y., Bai, Z., & Du, J. (2026). SciMKG: A Multimodal Knowledge Graph for Science Education with Text, Image, Video and Audio. Proceedings of the AAAI Conference on Artificial Intelligence, 40(18), 15466–15474. https://doi.org/10.1609/aaai.v40i18.38574

Section

AAAI Technical Track on Data Mining & Knowledge Management II