Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

Authors

  • Wenyi Xiao, School of Software Technology, Zhejiang University
  • Ziwei Huang, School of Software Technology, Zhejiang University
  • Leilei Gan, School of Software Technology, Zhejiang University
  • Wanggui He, Alibaba Group
  • Haoyuan Li, Alibaba Group
  • Zhelun Yu, Alibaba Group
  • Fangxun Shu, Alibaba Group
  • Hao Jiang, Alibaba Group
  • Linchao Zhu, College of Computer Science and Technology, Zhejiang University

DOI:

https://doi.org/10.1609/aaai.v39i24.34744

Abstract

Rapidly developing Large Vision Language Models (LVLMs) still suffer from the hallucination phenomenon, where generated responses do not align with the given contexts, significantly restricting their usage. Most previous work detects and mitigates hallucination at a coarse-grained level or requires expensive annotation (e.g., labeling by human experts or proprietary models). To address these issues, we propose detecting and mitigating hallucinations in LVLMs via fine-grained AI feedback. The basic idea is to generate a small sentence-level hallucination annotation dataset with proprietary models, on which we train a detection model that performs sentence-level hallucination detection. We then propose a detect-then-rewrite pipeline that automatically constructs a preference dataset for hallucination mitigation training. Furthermore, we propose differentiating the severity of hallucinations and introduce Hallucination Severity-Aware Direct Preference Optimization (HSA-DPO), which prioritizes the mitigation of critical hallucinations in LVLMs by incorporating hallucination severity into preference learning. Extensive experiments on hallucination detection and mitigation benchmarks demonstrate that our method sets a new state of the art in hallucination detection on MHaluBench, surpassing GPT-4V and Gemini, and reduces the hallucination rate by 36.1% on AMBER and 76.3% on Object HalBench compared to the base model.
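The abstract does not spell out the HSA-DPO objective, so the following is only a rough illustration of the idea it describes: weighting the standard DPO loss per preference pair by a hallucination-severity score from the detection model. The function name `hsa_dpo_loss`, the multiplicative weighting scheme, and the tensor shapes are assumptions for this sketch, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def hsa_dpo_loss(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps,
                 severity, beta=0.1):
    """Hypothetical severity-weighted DPO loss.

    Each *_logps tensor holds the summed log-probability of the preferred
    (chosen) or hallucinated (rejected) response under the trained policy
    or the frozen reference model; `severity` is a per-pair score in [0, 1]
    assumed to come from the sentence-level detection model.
    """
    # Standard DPO implicit-reward margin between chosen and rejected responses
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    logits = chosen_rewards - rejected_rewards
    # Scale each pair's loss by severity so that pairs whose rejected
    # response contains more critical hallucinations dominate the gradient
    return (severity * -F.logsigmoid(logits)).mean()

# Toy usage with dummy log-probs for a batch of two preference pairs
t = torch.tensor
loss = hsa_dpo_loss(t([-10.0, -12.0]), t([-15.0, -13.0]),
                    t([-11.0, -12.5]), t([-14.0, -13.5]),
                    severity=t([0.9, 0.3]))
```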

Published

2025-04-11

How to Cite

Xiao, W., Huang, Z., Gan, L., He, W., Li, H., Yu, Z., … Zhu, L. (2025). Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback. Proceedings of the AAAI Conference on Artificial Intelligence, 39(24), 25543–25551. https://doi.org/10.1609/aaai.v39i24.34744

Section

AAAI Technical Track on Natural Language Processing III