MoLE: Decoding by Mixture of Layer Experts Alleviates Hallucination in Large Vision-Language Models

Authors

  • Tian Liang, Zhejiang University
  • Yuetian Du, Zhejiang University
  • Jing Huang, Zhejiang University
  • Ming Kong, Zhejiang University
  • Luyuan Chen, Beijing Information Science and Technology University
  • Yadong Li, Ant Group
  • Siye Chen, Ant Group
  • Qiang Zhu, Zhejiang University

DOI:

https://doi.org/10.1609/aaai.v39i18.34056

Abstract

Recent advancements in Large Vision-Language Models (LVLMs) highlight their ability to integrate and process multi-modal information. However, hallucinations, where generated content is inconsistent with the visual input and the instructions, remain a challenge. In this paper, we analyze the layer-wise decoding of LVLMs and find that hallucinations can arise during the reasoning and factual information injection process. Additionally, as the number of generated tokens increases, forgetting of the original prompt may also lead to hallucinations. To address this, we propose a training-free decoding method called Mixture of Layer Experts (MoLE). MoLE uses a heuristic gating mechanism to dynamically select multiple layers of an LVLM as expert layers: the Final Expert, the Second Opinion Expert, and the Prompt Retention Expert. Through the cooperation of these experts, MoLE enhances the robustness and faithfulness of the generation process. Our extensive experiments demonstrate that MoLE significantly reduces hallucinations, outperforming the current state-of-the-art decoding techniques across three mainstream LVLMs and two established hallucination benchmarks. Moreover, our method reveals the potential of LVLMs to independently produce more reliable and accurate outputs.
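
The abstract does not specify how the gate selects layers or how the experts are combined, so the following is only an illustrative sketch of layer-expert mixing at a single decoding step, not the authors' implementation. It assumes that next-token logits are available from every layer (for example via early exit), that the Second Opinion Expert is the most confident earlier layer, that the Prompt Retention Expert is a layer index supplied by the caller, and that experts are combined as a weighted sum of log-probabilities. The names `mix_layer_experts`, `prompt_layer`, and `weights` are hypothetical.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def mix_layer_experts(layer_logits, prompt_layer, weights=(0.6, 0.2, 0.2)):
    """Pick the next token from a mixture of three layer experts.

    layer_logits: array of shape (num_layers, vocab_size) holding the
        next-token logits read out from each layer (assumed available).
    prompt_layer: index of the layer used as the prompt-retention expert
        (hypothetical; the abstract does not say how it is chosen).
    weights: mixing weights for (final, second opinion, prompt retention);
        these values are placeholders, not taken from the paper.
    Returns (token_id, second_opinion_layer_index).
    """
    log_probs = log_softmax(np.asarray(layer_logits, dtype=float))

    # Final expert: the last layer, i.e. ordinary decoding.
    final = log_probs[-1]

    # Assumed heuristic gate: among earlier layers, take the one whose
    # top token has the highest probability (the most confident layer).
    earlier = log_probs[:-1]
    second_idx = int(earlier.max(axis=-1).argmax())
    second = earlier[second_idx]

    # Prompt-retention expert: a caller-supplied layer (assumption).
    retention = log_probs[prompt_layer]

    w_final, w_second, w_prompt = weights
    mixed = w_final * final + w_second * second + w_prompt * retention
    return int(mixed.argmax()), second_idx


# Toy example: 3 layers, vocabulary of 3 tokens.
toy_logits = np.array([
    [0.0, 0.0, 0.0],  # layer 0: uniform
    [5.0, 0.0, 0.0],  # layer 1: confident in token 0
    [0.0, 1.0, 0.0],  # layer 2 (final): mildly prefers token 1
])
token, second_layer = mix_layer_experts(toy_logits, prompt_layer=0)
```

In this toy case the final layer alone would pick token 1, while the mixture is pulled toward token 0 by the confident intermediate layer; setting `weights=(1.0, 0.0, 0.0)` recovers ordinary last-layer decoding.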

Published

2025-04-11

How to Cite

Liang, T., Du, Y., Huang, J., Kong, M., Chen, L., Li, Y., … Zhu, Q. (2025). MoLE: Decoding by Mixture of Layer Experts Alleviates Hallucination in Large Vision-Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 39(18), 18684–18692. https://doi.org/10.1609/aaai.v39i18.34056

Issue

Vol. 39 No. 18 (2025)

Section

AAAI Technical Track on Machine Learning IV