Correspondence Coverage Matters for Multi-Modal Dataset Distillation

Authors

  • Zhuohang Dang, School of Computer Science and Technology, MOEKLINNS Laboratory, Xi'an Jiaotong University; State Key Laboratory of Communication Content Cognition, China
  • Minnan Luo, School of Computer Science and Technology, MOEKLINNS Laboratory, Xi'an Jiaotong University; State Key Laboratory of Communication Content Cognition, China
  • Chengyou Jia, School of Computer Science and Technology, MOEKLINNS Laboratory, Xi'an Jiaotong University; State Key Laboratory of Communication Content Cognition, China
  • Hangwei Qian, Centre for Frontier AI Research, Agency for Science, Technology and Research (A*STAR), Singapore; Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore
  • Xinyu Zhang, School of Computer Science and Technology, MOEKLINNS Laboratory, Xi'an Jiaotong University
  • Xiaojun Chang, University of Science and Technology of China
  • Ivor Tsang, Centre for Frontier AI Research, Agency for Science, Technology and Research (A*STAR), Singapore; Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore

DOI:

https://doi.org/10.1609/aaai.v40i25.39207

Abstract

Multi-modal dataset distillation (DD) condenses large datasets into compact ones that retain task efficacy by capturing correspondence patterns, i.e., shared semantics between paired modalities. However, such patterns rely on cross-modal similarity and cannot be faithfully captured by the intra-modal similarity that current unimodal strategies use. As a result, current multi-modal DD methods tend to over-concentrate, redundantly encoding similar correspondence patterns and thus limiting generalizability. To address this, we propose ProCo, a novel multi-modal DD framework that systematically Promotes Correspondence coverage. First, we develop a correspondence consistency metric based on cross-modal retrieval distributions to cluster correspondence patterns. These clusters capture the underlying correspondence distribution, enabling ProCo to initialize distilled data with representative patterns while regularizing optimization to promote correspondence representativeness and diversity. Moreover, we employ conditional neural fields for efficient distilled data parameterization, enhancing fine-grained pattern capture while allowing more distilled data under a fixed budget to boost correspondence coverage. Extensive experiments verify that ProCo achieves superior and elastic budget-efficacy trade-offs, surpassing prior methods by over 15% with a 10× smaller distillation budget, highlighting its real-world practicality.
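
The abstract does not spell out the correspondence consistency metric, so the following is only a rough sketch of the general idea of clustering image-text pairs by their cross-modal retrieval distributions. Everything concrete here is an assumption made for illustration and not the authors' formulation: the softmax retrieval distribution, the temperature, the Jensen-Shannon divergence as the (in)consistency measure, the k-medoids clustering, and the function names `retrieval_distributions`, `js_divergence_matrix`, and `k_medoids`.

```python
import numpy as np

def retrieval_distributions(img_emb, txt_emb, temperature=0.1):
    """Row i is a softmax over the similarity of image i to every text.

    Assumed form of a "cross-modal retrieval distribution"; the paper's
    exact definition is not given in the abstract.
    """
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def js_divergence_matrix(p, eps=1e-12):
    """Pairwise Jensen-Shannon divergence between the rows of p.

    Uses JS(p, q) = H((p + q) / 2) - (H(p) + H(q)) / 2.
    Lower divergence is treated here as higher correspondence consistency.
    """
    n = p.shape[0]
    entropy = -(p * np.log(p + eps)).sum(axis=1)
    dist = np.zeros((n, n))
    for i in range(n):
        mix = 0.5 * (p[i] + p)
        entropy_mix = -(mix * np.log(mix + eps)).sum(axis=1)
        dist[i] = entropy_mix - 0.5 * (entropy[i] + entropy)
    return np.clip(dist, 0.0, None)

def k_medoids(dist, k, n_iter=20, seed=0):
    """Plain alternating k-medoids on a precomputed distance matrix."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(dist.shape[0], size=k, replace=False)
    for _ in range(n_iter):
        labels = dist[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # The member with the smallest total distance to its cluster.
                within = dist[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = dist[:, medoids].argmin(axis=1)
    return medoids, labels

if __name__ == "__main__":
    # Random stand-ins for paired image/text embeddings (hypothetical data).
    rng = np.random.default_rng(0)
    image_embeddings = rng.normal(size=(200, 32))
    text_embeddings = rng.normal(size=(200, 32))

    p = retrieval_distributions(image_embeddings, text_embeddings)
    dist = js_divergence_matrix(p)
    medoids, labels = k_medoids(dist, k=10)
    print("representative pair indices:", medoids)
```

In a sketch like this, the medoid pairs would act as candidate "representative patterns" for initializing distilled data, one per cluster, so that the initial set spreads across distinct correspondence patterns instead of concentrating on a few similar ones.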

Published

2026-03-14

How to Cite

Dang, Z., Luo, M., Jia, C., Qian, H., Zhang, X., Chang, X., & Tsang, I. (2026). Correspondence Coverage Matters for Multi-Modal Dataset Distillation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(25), 20693–20701. https://doi.org/10.1609/aaai.v40i25.39207

Section

AAAI Technical Track on Machine Learning II