Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

Authors

  • Wenyi Xiao, School of Software Technology, Zhejiang University
  • Ziwei Huang, School of Software Technology, Zhejiang University
  • Leilei Gan, School of Software Technology, Zhejiang University
  • Wanggui He, Alibaba Group
  • Haoyuan Li, Alibaba Group
  • Zhelun Yu, Alibaba Group
  • Fangxun Shu, Alibaba Group
  • Hao Jiang, Alibaba Group
  • Linchao Zhu, College of Computer Science and Technology, Zhejiang University

DOI:

https://doi.org/10.1609/aaai.v39i24.34744

Abstract

Rapidly developing Large Vision Language Models (LVLMs) still suffer from the hallucination phenomenon, where generated responses do not align with the given contexts, significantly restricting their usage. Most previous work detects and mitigates hallucination at a coarse-grained level or requires expensive annotation (e.g., labeling by human experts or proprietary models). To address these issues, we propose detecting and mitigating hallucinations in LVLMs via fine-grained AI feedback. The basic idea is to generate a small sentence-level hallucination annotation dataset with proprietary models, on which we train a detection model that performs sentence-level hallucination detection. We then propose a detect-then-rewrite pipeline that automatically constructs a preference dataset for hallucination mitigation training. Furthermore, we propose differentiating the severity of hallucinations and introduce Hallucination Severity-Aware Direct Preference Optimization (HSA-DPO), which prioritizes the mitigation of critical hallucinations in LVLMs by incorporating hallucination severity into preference learning. Extensive experiments on hallucination detection and mitigation benchmarks demonstrate that our method sets a new state of the art in hallucination detection on MHaluBench, surpassing GPT-4V and Gemini, and reduces the hallucination rate by 36.1% on AMBER and 76.3% on Object HalBench compared to the base model.
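The abstract does not spell out the HSA-DPO objective, so the following is only a rough illustration of the idea it describes: weighting the standard DPO loss per preference pair by a hallucination-severity score from the detection model. The function name `hsa_dpo_loss`, the multiplicative weighting scheme, and the tensor shapes are assumptions for this sketch, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def hsa_dpo_loss(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps,
                 severity, beta=0.1):
    """Hypothetical severity-weighted DPO loss.

    Each *_logps tensor holds the summed log-probability of the preferred
    (chosen) or hallucinated (rejected) response under the trained policy
    or the frozen reference model; `severity` is a per-pair score in [0, 1]
    assumed to come from the sentence-level detection model.
    """
    # Standard DPO implicit-reward margin between chosen and rejected responses
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    logits = chosen_rewards - rejected_rewards
    # Scale each pair's loss by severity so that pairs whose rejected
    # response contains more critical hallucinations dominate the gradient
    return (severity * -F.logsigmoid(logits)).mean()

# Toy usage with dummy log-probs for a batch of two preference pairs
t = torch.tensor
loss = hsa_dpo_loss(t([-10.0, -12.0]), t([-15.0, -13.0]),
                    t([-11.0, -12.5]), t([-14.0, -13.5]),
                    severity=t([0.9, 0.3]))
```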

Published

2025-04-11

How to Cite

Xiao, W., Huang, Z., Gan, L., He, W., Li, H., Yu, Z., … Zhu, L. (2025). Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback. Proceedings of the AAAI Conference on Artificial Intelligence, 39(24), 25543–25551. https://doi.org/10.1609/aaai.v39i24.34744

Section

AAAI Technical Track on Natural Language Processing III