Generating Ontology-Learning Training-Data through Verbalization

Authors

  • Antonio Zaitoun, University of Haifa
  • Tomer Sagi, Aalborg University
  • Mor Peleg, University of Haifa

DOI:

https://doi.org/10.1609/aaaiss.v4i1.31797

Abstract

Ontologies play an important role in the organization and representation of knowledge. However, in most cases, ontologies do not fully cover domain knowledge, resulting in a gap. This gap, often expressed as a lack of concepts, relations, or axioms, is usually filled by domain experts in a manual and tedious process. Utilizing large language models (LLMs) can ease this process: a fine-tuned LLM could receive up-to-date, reliable natural-language domain knowledge as input and output a structured graph in OWL RDF/Turtle format, the standard format of ontologies. Fine-tuning such a model, however, requires a dataset of text-OWL sentence pairs. Unfortunately, no such dataset exists in the literature or within the open-source community. Therefore, this paper introduces our LLM-assisted verbalizer, which creates such a dataset by converting OWL statements from existing ontologies into natural text. We evaluate the verbalizer on 322 classes from four different ontologies using two different LLMs, achieving precision and recall as high as 0.99 and 0.96, respectively.
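To illustrate the kind of training data the abstract describes, the sketch below shows what one text-OWL pair might look like. This is a hypothetical example, not from the paper: the class IRI, labels, and verbalized sentence are illustrative assumptions.

```python
# Hypothetical sketch of a single text-OWL training pair for
# fine-tuning a text-to-ontology model. The ontology fragment and
# wording are invented for illustration; the paper's verbalizer
# generates the natural-text side from existing OWL statements.

owl_turtle = """\
@prefix : <http://example.org/onto#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

:Hypertension a owl:Class ;
    rdfs:subClassOf :CardiovascularDisorder ;
    rdfs:label "hypertension" .
"""

# The verbalizer (LLM-assisted in the paper) converts the OWL
# statement above into natural text such as this sentence.
verbalization = "Hypertension is a kind of cardiovascular disorder."

# Paired together, the two sides form one fine-tuning example:
# natural text as input, OWL Turtle as the target output.
training_pair = {"input": verbalization, "output": owl_turtle}
```

At fine-tuning time, many such pairs (one per class or axiom drawn from existing ontologies) would make up the dataset the paper sets out to build.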

Published

2024-11-08

How to Cite

Zaitoun, A., Sagi, T., & Peleg, M. (2024). Generating Ontology-Learning Training-Data through Verbalization. Proceedings of the AAAI Symposium Series, 4(1), 233-241. https://doi.org/10.1609/aaaiss.v4i1.31797

Section

Large Language Models for Knowledge Graph and Ontology Engineering