Disentangling for Transfer: Boosting Limited Modalities via Information-Theoretic Regularization and Cross-Modal Reconstruction
DOI:
https://doi.org/10.1609/aaai.v40i15.38305
Abstract
Missing critical modalities in medical imaging pose significant challenges for AI-driven diagnostic systems, particularly in scenarios where limited modalities must suffice for downstream tasks. Existing approaches often fail to fully leverage privileged features available only at training time, or to address the information gap between privileged and limited modalities, resulting in suboptimal performance. To address this, we propose a unified, dual-stage Disentanglement-AligNmenT framEwork (DANTE), which uses Information-Theoretic Regularization and Cross-Modal Reconstruction to decompose full-modality information into alignable and privileged-exclusive components. In the first stage, a self-supervised pre-training strategy based on cross-modal reconstruction acts as a proxy task to implicitly incentivize disentangled representations. In the second stage, we present an information-theoretic regularization to explicitly maximize the transfer of privileged knowledge through two novel modules: (1) a Mutual Alignment Module that employs multilevel bidirectional alignment between limited-modality features and alignable features, enhancing cross-modal representation consistency; (2) a Privileged Compaction Module that restricts the privileged-exclusive information flow, promoting the integration of task-relevant content into alignable representations. Experimental results on three challenging medical datasets show that DANTE achieves state-of-the-art performance, demonstrating its effectiveness in leveraging privileged guidance under modality scarcity and its broad applicability across diverse medical imaging scenarios.
Published
2026-03-14
How to Cite
Zhang, Z., Zhou, Y.-J., Hu, Y., Ma, X., Yuan, Z., Wang, Z., … Xu, M. (2026). Disentangling for Transfer: Boosting Limited Modalities via Information-Theoretic Regularization and Cross-Modal Reconstruction. Proceedings of the AAAI Conference on Artificial Intelligence, 40(15), 13052–13060. https://doi.org/10.1609/aaai.v40i15.38305
Issue
Section
AAAI Technical Track on Computer Vision XII