MULTIBENCH++: A Unified and Comprehensive Multimodal Fusion Benchmarking Across Specialized Domains

Leyan Xue; Changqing Zhang; Kecheng Xue; Xiaohong Liu; Guangyu Wang; Zongbo Han

doi:10.1609/aaai.v40i32.39963

Authors

Leyan Xue, School of Artificial Intelligence, Tianjin University
Changqing Zhang, School of Artificial Intelligence, Tianjin University
Kecheng Xue, State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications
Xiaohong Liu, Institute of Medical Artificial Intelligence, South China Hospital, Medical School, Shenzhen University
Guangyu Wang, State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications
Zongbo Han, State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications

Abstract

Although multimodal fusion has made significant progress, its advancement is severely hindered by the lack of adequate evaluation benchmarks. Current fusion methods are typically evaluated on a small selection of public datasets, a limited scope that inadequately represents the complexity and diversity of real-world scenarios, potentially leading to biased evaluations. This issue presents a twofold challenge. On one hand, models may overfit to the biases of specific datasets, hindering their generalization to broader practical applications. On the other hand, the absence of a unified evaluation standard makes fair and objective comparisons between different fusion methods difficult. Consequently, a truly universal and high-performance fusion model has yet to emerge. To address these challenges, we have developed a large-scale, domain-adaptive benchmark for multimodal evaluation. This benchmark integrates over 30 datasets, encompassing 15 modalities and 20 predictive tasks across key application domains. To complement this, we have also developed an open-source, unified, and automated evaluation pipeline that includes standardized implementations of state-of-the-art models and diverse fusion paradigms. Leveraging this platform, we have conducted large-scale experiments, successfully establishing new performance baselines across multiple tasks. This work provides the academic community with a crucial platform for rigorous and reproducible assessment of multimodal models, aiming to propel the field of multimodal artificial intelligence to new heights.
