Verb Mirage: Unveiling and Assessing Verb Concept Hallucinations in Multimodal Large Language Models

Authors

  • Zehao Wang, Shanghai Jiao Tong University
  • Xinpeng Liu, Shanghai Jiao Tong University; Shanghai Innovation Institute
  • Yudonglin Zhang, Shanghai Jiao Tong University
  • Xiaoqian Wu, Shanghai Jiao Tong University
  • Zhou Fang, Shanghai Jiao Tong University
  • Yifan Fang, Shanghai Jiao Tong University
  • Junfu Pu, ARC Lab, Tencent PCG
  • Cewu Lu, Shanghai Jiao Tong University; Shanghai Innovation Institute
  • Yong-Lu Li, Shanghai Jiao Tong University; Shanghai Innovation Institute

DOI:

https://doi.org/10.1609/aaai.v40i12.38005

Abstract

Multimodal Large Language Models (MLLMs) have recently garnered significant attention and demonstrate outstanding capabilities across tasks such as OCR, VQA, and captioning. However, hallucination remains a persistent issue. Although numerous methods have been proposed to mitigate hallucinations and have achieved notable improvements, they focus primarily on hallucinations involving object/noun concepts; verb concepts, which are crucial for understanding human actions, have been largely overlooked. In this paper, to the best of our knowledge, we are the first to investigate the verb hallucination phenomenon in MLLMs from multiple perspectives. Our findings reveal that most state-of-the-art MLLMs suffer from severe verb hallucination. We further evaluate existing mitigation methods designed for object concept hallucination and find that they do not effectively address verb hallucination. To tackle this issue, we propose a baseline method based on fine-tuning with rich verb knowledge. Experimental results demonstrate that our method significantly reduces verb-related hallucinations.

Published

2026-03-14

How to Cite

Wang, Z., Liu, X., Zhang, Y., Wu, X., Fang, Z., Fang, Y., … Li, Y.-L. (2026). Verb Mirage: Unveiling and Assessing Verb Concept Hallucinations in Multimodal Large Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(12), 10349–10357. https://doi.org/10.1609/aaai.v40i12.38005

Section

AAAI Technical Track on Computer Vision IX