Generating Ontology-Learning Training-Data through Verbalization

Authors

  • Antonio Zaitoun, University of Haifa
  • Tomer Sagi, Aalborg University
  • Mor Peleg, University of Haifa

DOI:

https://doi.org/10.1609/aaaiss.v4i1.31797

Abstract

Ontologies play an important role in the organization and representation of knowledge. However, in most cases, ontologies do not fully cover domain knowledge, resulting in a gap. This gap, often expressed as a lack of concepts, relations, or axioms, is usually filled by domain experts in a manual and tedious process. Utilizing large language models (LLMs) can ease this process: a fine-tuned LLM could receive up-to-date, reliable natural-language domain knowledge as input and output a structured graph in OWL RDF/Turtle format, the standard format of ontologies. Fine-tuning such a model, however, requires a dataset of text-OWL sentence pairs. Unfortunately, no such dataset exists in the literature or within the open-source community. Therefore, this paper introduces our LLM-assisted verbalizer, which creates such a dataset by converting OWL statements from existing ontologies into natural text. We evaluate the verbalizer on 322 classes from four different ontologies using two different LLMs, achieving precision and recall as high as 0.99 and 0.96, respectively.
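To illustrate the kind of training data the abstract describes, the sketch below shows what one text-OWL pair might look like. This is a hypothetical example, not from the paper: the class IRI, labels, and verbalized sentence are illustrative assumptions.

```python
# Hypothetical sketch of a single text-OWL training pair for
# fine-tuning a text-to-ontology model. The ontology fragment and
# wording are invented for illustration; the paper's verbalizer
# generates the natural-text side from existing OWL statements.

owl_turtle = """\
@prefix : <http://example.org/onto#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

:Hypertension a owl:Class ;
    rdfs:subClassOf :CardiovascularDisorder ;
    rdfs:label "hypertension" .
"""

# The verbalizer (LLM-assisted in the paper) converts the OWL
# statement above into natural text such as this sentence.
verbalization = "Hypertension is a kind of cardiovascular disorder."

# Paired together, the two sides form one fine-tuning example:
# natural text as input, OWL Turtle as the target output.
training_pair = {"input": verbalization, "output": owl_turtle}
```

At fine-tuning time, many such pairs (one per class or axiom drawn from existing ontologies) would make up the dataset the paper sets out to build.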

Published

2024-11-08

How to Cite

Zaitoun, A., Sagi, T., & Peleg, M. (2024). Generating Ontology-Learning Training-Data through Verbalization. Proceedings of the AAAI Symposium Series, 4(1), 233-241. https://doi.org/10.1609/aaaiss.v4i1.31797

Section

Large Language Models for Knowledge Graph and Ontology Engineering