Diffusion for Combating the Hallucination in Large Language Models (Student Abstract)

Authors

  • Hyojun Ahn, Korea University
  • Joongheon Kim, Korea University

DOI:

https://doi.org/10.1609/aaai.v40i48.42183

Abstract

Large language models (LLMs) often generate hallucinations—fluent yet factually incorrect responses—that undermine reliability in knowledge-intensive tasks. Existing approaches for hallucination mitigation typically rely on external retrieval modules or probability heuristics, which either require additional resources or lack interpretability. In this work, we propose a diffusion-based hallucination detection framework (DHDF) that leverages U-Net denoising to reconstruct consensus answers from multiple LLM outputs. If the diffusion process exhibits spurious convergence away from factual ground truth, it provides a clear signal of hallucination. To quantify factual correctness, we incorporate TruthfulQA scores as a fact-grounded evaluation metric, distinguishing well-aligned models (high scores) from hallucination-prone models (low scores). Experimental results demonstrate that convergence dynamics under diffusion, combined with fact-grounded QA evaluation, offer an effective and interpretable pathway for hallucination detection without relying on external knowledge bases.
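
As a rough illustration of the idea in the abstract, the toy sketch below perturbs several answer embeddings with noise, denoises them toward a consensus, records how quickly they converge, and flags the case where the consensus lands far from a fact-grounded reference. It is not the authors' DHDF implementation. The learned U-Net denoiser is replaced by a simple pull toward the centroid, and every name here (`toy_denoise`, `flag_hallucination`, the 2-D embeddings, the reference vector, the `threshold` value) is a hypothetical assumption made for illustration only.

```python
import math
import random


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def spread(vectors):
    """Mean distance of the vectors to their centroid (convergence measure)."""
    c = centroid(vectors)
    return sum(distance(v, c) for v in vectors) / len(vectors)


def add_noise(vectors, sigma, rng):
    """Forward step: perturb every component with Gaussian noise."""
    return [[x + rng.gauss(0.0, sigma) for x in v] for v in vectors]


def toy_denoise(vectors, steps=10, rate=0.5):
    """Assumed stand-in for a learned denoiser (NOT a U-Net):
    each step pulls every vector part of the way toward the centroid."""
    history = [spread(vectors)]
    for _ in range(steps):
        c = centroid(vectors)
        vectors = [[x + rate * (ci - x) for x, ci in zip(v, c)] for v in vectors]
        history.append(spread(vectors))
    return centroid(vectors), history


def flag_hallucination(answers, reference, threshold=1.0, sigma=0.1, seed=0):
    """Return (flag, gap, history). The flag is True when the denoised
    consensus ends up farther than `threshold` (hypothetical value) from
    the hypothetical fact-grounded reference embedding."""
    rng = random.Random(seed)
    noisy = add_noise(answers, sigma, rng)
    consensus, history = toy_denoise(noisy)
    gap = distance(consensus, reference)
    return gap > threshold, gap, history


if __name__ == "__main__":
    reference = [0.0, 0.0]  # hypothetical embedding of the factual answer
    aligned = [[0.1, 0.0], [-0.1, 0.05], [0.0, -0.05]]
    drifted = [[3.0, 3.1], [2.9, 3.0], [3.1, 2.9]]
    for name, answers in (("aligned", aligned), ("drifted", drifted)):
        flag, gap, history = flag_hallucination(answers, reference)
        print(name, "flagged:", flag, "gap:", round(gap, 3))
```

In this sketch both answer sets converge (the spread shrinks at every step), so convergence alone does not separate them; the signal comes from where the consensus settles relative to the reference, which loosely mirrors the "spurious convergence away from factual ground truth" described above.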

Published

2026-03-14

How to Cite

Ahn, H., & Kim, J. (2026). Diffusion for Combating the Hallucination in Large Language Models (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 40(48), 41119–41120. https://doi.org/10.1609/aaai.v40i48.42183