Causality-Aligned Semantic Recovery for Incomplete Cross-Modal Retrieval

Authors

  • Haipeng Chen (College of Computer Science and Technology, Jilin University, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, China)
  • Yu Liu (College of Computer Science and Technology, Jilin University, China)
  • Xun Yang (University of Science and Technology of China)
  • Yuheng Liang (College of Computer Science and Technology, Jilin University, China)
  • Yingda Lyu (Public Computer Education and Research Center, Jilin University, China)

DOI:

https://doi.org/10.1609/aaai.v40i24.39090

Abstract

Incomplete cross-modal retrieval (ICMR) requires models to recover missing modalities and robustly align heterogeneous ones for effective retrieval. Existing methods, however, fall short on both counts. They often rely on limited semantic cues, such as single samples or coarse category prototypes, which compromises reconstruction quality. Moreover, these approaches are vulnerable to learning spurious cross-modal correlations, which impairs accurate alignment and hinders retrieval performance. To address these challenges, we propose Causality-Aligned Semantic Recovery (CASR), a novel method designed to both comprehensively restore missing modalities and mitigate spurious associations between vision and language. CASR comprises two essential components: i) the Missing Modality Imagination (MMI) module, which combines category semantic priors with relevant contextual information to achieve high-quality semantic reconstruction; ii) the Explicit Causal Alignment (ECA) module, which explicitly learns environment-invariant attention, effectively eliminating the interference of spurious correlations and improving retrieval performance. Furthermore, we extend CASR to the challenging task of Partially Aligned Cross-Modal Retrieval, where we treat unlabeled unpaired data as a form of incomplete data. With the MMI and ECA modules, CASR learns robust representations in this setting. Extensive experiments on benchmark datasets under various missing rates demonstrate that CASR achieves superior robustness and retrieval performance.
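
To make the idea of combining a category prior with contextual information concrete, the following is a minimal illustrative sketch, not the MMI module itself. The abstract does not specify the fusion rule, so everything here is an assumption: the function name imagine_missing, the parameters alpha, k, and temperature, the use of a class-mean text feature as the category prior, and the softmax-weighted average over the k most image-similar complete samples as the context.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def imagine_missing(query_img, label, paired, alpha=0.5, k=2, temperature=0.1):
    """Estimate a missing text feature for a sample that only has an image feature.

    paired: list of (img_feat, txt_feat, label) tuples for complete samples.
    All names and the fusion rule are hypothetical, chosen for illustration only.
    """
    dim = len(paired[0][1])

    # Category prior (assumed form): mean text feature of complete samples
    # that share the query's label.
    same = [t for _, t, y in paired if y == label]
    if same:
        prototype = [sum(t[d] for t in same) / len(same) for d in range(dim)]
    else:
        # No labelled neighbour available: fall back to context only.
        prototype = [0.0] * dim
        alpha = 0.0

    # Context (assumed form): text features of the k complete samples whose
    # image features are most similar to the query image.
    scored = sorted(
        ((cosine(query_img, img), txt) for img, txt, _ in paired),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    weights = [math.exp(score / temperature) for score, _ in scored]
    total = sum(weights)
    context = [
        sum(w * txt[d] for w, (_, txt) in zip(weights, scored)) / total
        for d in range(dim)
    ]

    # Convex combination of the category prior and the contextual estimate.
    return [alpha * p + (1 - alpha) * c for p, c in zip(prototype, context)]
```

Under these assumptions, alpha = 1 reduces to a pure category prototype and alpha = 0 to a pure neighbour-based estimate, which are the two limited cues the abstract attributes to existing methods; intermediate values blend them.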

Published

2026-03-14

How to Cite

Chen, H., Liu, Y., Yang, X., Liang, Y., & Lyu, Y. (2026). Causality-Aligned Semantic Recovery for Incomplete Cross-Modal Retrieval. Proceedings of the AAAI Conference on Artificial Intelligence, 40(24), 20050–20058. https://doi.org/10.1609/aaai.v40i24.39090

Section

AAAI Technical Track on Machine Learning I