Explore How to Inject Beneficial Noise in MLLMs

Authors

  • Ruishu Zhu School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University Institute of Artificial Intelligence (TeleAI), China Telecom
  • Sida Huang School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University Institute of Artificial Intelligence (TeleAI), China Telecom
  • Ziheng Jiao Huawei Technologies Co., Ltd.
  • Hongyuan Zhang Institute of Artificial Intelligence (TeleAI), China Telecom The University of Hong Kong

DOI:

https://doi.org/10.1609/aaai.v40i34.40153

Abstract

Multimodal Large Language Models (MLLMs) play an increasingly important role in multimodal intelligence. However, existing fine-tuning methods often ignore cross-modal heterogeneity, limiting their full potential. In this work, we propose a novel fine-tuning strategy that injects beneficial random noise, outperforming previous methods and even surpassing full fine-tuning with minimal additional parameters. The proposed Multimodal Noise Generator (MuNG) enables efficient modality fine-tuning by injecting customized noise into frozen MLLMs. Specifically, we reformulate the reasoning process of MLLMs from a variational inference perspective and, on this basis, design a multimodal noise generator that dynamically analyzes cross-modal relationships in image-text pairs to generate task-adaptive beneficial noise. Injecting this noise into the MLLMs effectively suppresses irrelevant semantic components, leading to significantly improved cross-modal representation alignment and enhanced performance on downstream tasks. Experiments on two mainstream MLLMs, QwenVL and LLaVA, demonstrate that our method surpasses full-parameter fine-tuning and other existing fine-tuning approaches while requiring only about 1–2% additional parameters.
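The core idea in the abstract, a small trainable generator that produces noise from an image-text pair and adds it to a frozen backbone's features, can be sketched roughly as below. All shapes, names, and the generator architecture here are illustrative assumptions, not the authors' MuNG implementation:

```python
import numpy as np

# Hedged sketch of noise injection into a frozen model: a tiny trainable
# module (W, b) maps concatenated cross-modal features to an additive
# noise vector, while the backbone producing the features stays frozen.
rng = np.random.default_rng(0)
d = 8  # feature dimension (assumed for illustration)

# Stand-ins for frozen-backbone features of one image-text pair.
img_feat = rng.standard_normal(d)
txt_feat = rng.standard_normal(d)

# Tiny generator parameters: the only trainable weights (~small relative
# to the frozen backbone, mirroring the "1-2% additional parameters" claim).
W = rng.standard_normal((d, 2 * d)) * 0.01
b = np.zeros(d)

def generate_noise(img, txt):
    """Produce noise conditioned on the cross-modal pair (hypothetical form)."""
    z = np.concatenate([img, txt])
    return np.tanh(W @ z + b)  # tanh keeps the perturbation bounded

# Injection step: perturb the frozen features; only W and b would be
# updated by a downstream task loss.
noise = generate_noise(img_feat, txt_feat)
perturbed = img_feat + noise
print(perturbed.shape)  # (8,)
```

The bounded activation is a design choice in this sketch: it keeps the injected perturbation small so the frozen representations are nudged rather than overwritten.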

Published

2026-03-14

How to Cite

Zhu, R., Huang, S., Jiao, Z., & Zhang, H. (2026). Explore How to Inject Beneficial Noise in MLLMs. Proceedings of the AAAI Conference on Artificial Intelligence, 40(34), 29150–29158. https://doi.org/10.1609/aaai.v40i34.40153

Section

AAAI Technical Track on Machine Learning XI