Diffusion for Combating the Hallucination in Large Language Models (Student Abstract)
DOI:
https://doi.org/10.1609/aaai.v40i48.42183
Abstract
Large language models (LLMs) often generate hallucinations (fluent yet factually incorrect responses) that undermine reliability in knowledge-intensive tasks. Existing approaches for hallucination mitigation typically rely on external retrieval modules or probability heuristics, which either require additional resources or lack interpretability. In this work, we propose a diffusion-based hallucination detection framework (DHDF) that leverages U-Net denoising to reconstruct consensus answers from multiple LLM outputs. If the diffusion process exhibits spurious convergence away from factual ground truth, it provides a clear signal of hallucination. To quantify factual correctness, we incorporate TruthfulQA scores as a fact-grounded evaluation metric, distinguishing well-aligned models (high scores) from hallucination-prone models (low scores). Experimental results demonstrate that convergence dynamics under diffusion, combined with fact-grounded QA evaluation, offer an effective and interpretable pathway for hallucination detection without relying on external knowledge bases.
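The abstract does not give implementation details, so the following is only a toy sketch of the general idea of reconstructing a consensus from several noisy answers and checking where it converges. Everything in it is an assumption made for illustration: answers are represented as hypothetical embedding vectors, `toy_denoise_step` is a simple mean-pulling update standing in for the U-Net denoiser (it is not a U-Net), and `hallucination_score` uses cosine distance to a reference embedding as a stand-in for a fact-grounded score such as TruthfulQA.

```python
import numpy as np


def toy_denoise_step(x, step=0.2):
    """Hypothetical stand-in for a learned denoiser (NOT a U-Net).

    Pulls every row of x a fraction `step` toward the row mean, so the
    samples contract toward a shared consensus while the mean is preserved.
    """
    return x + step * (x.mean(axis=0, keepdims=True) - x)


def consensus_trajectory(answers, n_steps=30, noise_scale=0.5, seed=0):
    """Perturb the answer embeddings with Gaussian noise, then denoise.

    answers: array of shape (n_answers, dim), one embedding per sampled
             LLM output (the embedding model is assumed, not specified).
    Returns the consensus vector and the mean spread after each step.
    """
    rng = np.random.default_rng(seed)
    x = answers + noise_scale * rng.standard_normal(answers.shape)
    spreads = []
    for _ in range(n_steps):
        x = toy_denoise_step(x)
        # Mean distance of the samples from their current centroid.
        spreads.append(float(np.linalg.norm(x - x.mean(axis=0), axis=1).mean()))
    return x.mean(axis=0), spreads


def hallucination_score(consensus, reference):
    """Cosine distance between the consensus and a reference embedding.

    Values near 0 mean the consensus agrees with the reference; larger
    values mean the samples converged somewhere else.
    """
    cos = consensus @ reference / (
        np.linalg.norm(consensus) * np.linalg.norm(reference)
    )
    return 1.0 - float(cos)
```

In this sketch, a set of answers can contract tightly (small final spread) and still receive a high `hallucination_score` if the point they contract to is far from the reference, which is one possible reading of "spurious convergence away from factual ground truth."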
Published
2026-03-14
How to Cite
Ahn, H., & Kim, J. (2026). Diffusion for Combating the Hallucination in Large Language Models (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 40(48), 41119–41120. https://doi.org/10.1609/aaai.v40i48.42183
Issue
Section
AAAI Student Abstract and Poster Program