Semantic Verification in Large Language Model-based Retrieval Augmented Generation


  • Andreas Martin FHNW University of Applied Sciences Northwestern Switzerland
  • Hans Friedrich Witschel FHNW University of Applied Sciences Northwestern Switzerland
  • Maximilian Mandl Nagra
  • Mona Stockhecke Nagra



Semantic Verification, Large Language Models (LLMs), Retrieval Augmented Generation (RAG), Fact-checking, Swiss Direct Democracy, Hybrid Dialogue System


This position paper presents a novel approach to semantic verification in Large Language Model-based Retrieval Augmented Generation (LLM-RAG) systems. It focuses on the need for factually accurate information during public debates, especially in the run-up to plebiscites in direct democracies such as Switzerland. Recognizing the challenges that the current generation of Large Language Models (LLMs) faces in maintaining factual integrity, the research proposes a solution that integrates retrieval mechanisms with enhanced semantic verification processes. The paper outlines a methodology following a Design Science Research approach, which includes defining user personas, designing conversational interfaces, and iteratively developing a hybrid dialogue system. Central to this system is a semantic verification framework that uses a knowledge graph for fact-checking and validation, with the aim of ensuring the correctness and consistency of information generated by LLMs. The paper discusses the significance of this research in the context of Swiss direct democracy, where informed decision-making is pivotal. By improving the accuracy and reliability of information provided to the public, the proposed system aims to support the democratic process, enabling citizens to make well-informed decisions on complex issues. The research contributes to natural language processing and information retrieval, and illustrates the potential of AI and LLMs to enhance civic engagement and democratic participation.






Empowering Machine Learning and Large Language Models with Domain and Commonsense Knowledge