Revisiting MLLM Based Image Quality Assessment: Errors and Remedy
DOI: https://doi.org/10.1609/aaai.v40i11.37908

Abstract
The rapid progress of multi-modal large language models (MLLMs) has boosted the task of image quality assessment (IQA). However, a key challenge arises from the inherent mismatch between the discrete token outputs of MLLMs and the continuous nature of the quality scores required by IQA tasks. This discrepancy significantly hinders the performance of MLLM-based IQA methods. Previous approaches that convert discrete token predictions into continuous scores often suffer from conversion errors. Moreover, the semantic confusion introduced by level tokens (e.g., "good") further constrains the performance of MLLMs on IQA tasks and degrades their original capabilities on related tasks. To tackle these problems, we provide a theoretical analysis of the errors inherent in previous approaches and, motivated by this analysis, propose a simple yet effective framework, Q-Scorer. This framework incorporates a lightweight regression module and IQA-specific score tokens into the MLLM pipeline. Extensive experiments demonstrate that Q-Scorer achieves state-of-the-art performance across multiple IQA benchmarks, generalizes well to mixed datasets, and improves further when combined with other methods.

Published
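The abstract contrasts two ways of producing a continuous score from an MLLM. The sketch below illustrates the general idea only: a softmax-weighted expectation over discrete level tokens (the conversion step the paper identifies as error-prone), versus a lightweight regression head applied to the hidden state of a dedicated score token (the direction Q-Scorer takes). All function names, level values, and shapes here are hypothetical, not taken from the paper.

```python
import numpy as np

# Illustrative level-token values (hypothetical; a 5-level scale is common in IQA).
LEVEL_VALUES = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # bad..excellent

def score_from_level_tokens(level_logits):
    """Prior-style conversion: softmax over level-token logits, then the
    probability-weighted expectation of the level values. Quantizing a
    continuous score into a few tokens is one source of conversion error."""
    logits = np.asarray(level_logits, dtype=float)
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return float(probs @ LEVEL_VALUES)

def score_from_regression_head(hidden_state, weight, bias):
    """Regression-head idea (sketch): map the hidden state of an IQA-specific
    score token directly to a continuous value with a small linear module,
    avoiding the discrete-to-continuous conversion step entirely."""
    h = np.asarray(hidden_state, dtype=float)
    return float(weight @ h + bias)
```

For example, uniform logits over five levels yield the mid-scale score 3.0 under the expectation-based conversion, while the regression head is unconstrained by the level grid.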
2026-03-14
How to Cite
Tang, Z., Yang, S., Peng, B., Wang, Z., & Dong, J. (2026). Revisiting MLLM Based Image Quality Assessment: Errors and Remedy. Proceedings of the AAAI Conference on Artificial Intelligence, 40(11), 9475–9483. https://doi.org/10.1609/aaai.v40i11.37908
Section
AAAI Technical Track on Computer Vision VIII