Res-Bench: Benchmarking the Robustness of Multimodal Large Language Models to Dynamic Resolution Input

Authors

  • Chenxu Li University of Science and Technology of China
  • Zhicai Wang University of Science and Technology of China
  • Yuan Sheng University of Science and Technology of China
  • Xingyu Zhu University of Science and Technology of China
  • Yanbin Hao Hefei University of Technology
  • Xiang Wang University of Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v40i37.40420

Abstract

Multimodal Large Language Models (MLLMs) increasingly support dynamic image resolutions. However, current evaluation paradigms primarily assess semantic performance and overlook the critical question of resolution robustness: whether performance remains stable across varying input resolutions. To address this gap, we introduce Res-Bench, a comprehensive benchmark comprising 14,400 samples across 12 resolution levels and six core capability dimensions. We design a novel evaluation framework that goes beyond traditional accuracy metrics to capture performance stability. This framework introduces multiple robustness metrics: Spearman's correlation for assessing resolution-performance trends, and Absolute/Relative Continuous Error (ACE/RCE) for measuring performance volatility. Using these metrics, we conduct a large-scale evaluation of leading MLLMs. Our analysis encompasses: (1) model-centric and task-centric robustness examination, (2) investigation of preprocessing strategies including padding and super-resolution, and (3) exploration of fine-tuning for stability enhancement.
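
As a rough illustration of the trend metric named in the abstract, the sketch below computes Spearman's rank correlation between resolution level and accuracy for one model. The accuracy values are made up for illustration and are not results from the paper. The abstract does not define ACE or RCE, so the `adjacent_volatility` helper is a hypothetical stand-in (mean absolute change between neighbouring resolution levels) and should not be read as the paper's formula.

```python
from math import nan
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return nan  # undefined when either series is constant
    return cov / (var_x * var_y) ** 0.5


def adjacent_volatility(scores):
    """Hypothetical stand-in for a volatility metric (NOT the paper's ACE/RCE):
    mean absolute change in score between neighbouring resolution levels."""
    return mean(abs(b - a) for a, b in zip(scores, scores[1:]))


# Hypothetical accuracies of one model at 12 resolution levels (low to high).
levels = list(range(1, 13))
accuracy = [0.41, 0.48, 0.55, 0.53, 0.61, 0.66,
            0.64, 0.70, 0.72, 0.71, 0.74, 0.75]

print("Spearman rho:", spearman(levels, accuracy))
print("Adjacent volatility:", adjacent_volatility(accuracy))
```

A rho near 1 would indicate that accuracy rises almost monotonically with resolution, while the volatility figure reflects how much accuracy jumps between adjacent levels regardless of direction. The two numbers are complementary: a model can trend upward overall and still fluctuate from one level to the next.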

Published

2026-03-14

How to Cite

Li, C., Wang, Z., Sheng, Y., Zhu, X., Hao, Y., & Wang, X. (2026). Res-Bench: Benchmarking the Robustness of Multimodal Large Language Models to Dynamic Resolution Input. Proceedings of the AAAI Conference on Artificial Intelligence, 40(37), 31545–31553. https://doi.org/10.1609/aaai.v40i37.40420

Section

AAAI Technical Track on Natural Language Processing II