LUMIN: A Longitudinal Multi-modal Knowledge Decomposition Network for Predicting Breast Cancer Recurrence

Authors

  • Chunyao Lu (Netherlands Cancer Institute; Radboud University Medical Centre)
  • Tianyu Zhang (Netherlands Cancer Institute; Radboud University Medical Centre)
  • Xinglong Liang (Netherlands Cancer Institute; Radboud University Medical Centre)
  • Yuan Gao (Netherlands Cancer Institute; Maastricht University Medical Centre)
  • Luyi Han (Netherlands Cancer Institute; Radboud University Medical Centre)
  • Xin Wang (Netherlands Cancer Institute; Maastricht University Medical Centre)
  • Nika Rasoolzadeh (Netherlands Cancer Institute; Radboud University Medical Centre)
  • Tao Tan (Macao Polytechnic University)
  • Ritse Mann (Netherlands Cancer Institute; Radboud University Medical Centre)

DOI:

https://doi.org/10.1609/aaai.v40i9.37693

Abstract

Accurate prediction of breast cancer recurrence after treatment is essential for improving long-term outcomes. However, existing models are limited by three key challenges: (1) they typically rely on single-modal data, missing cross-modal interactions; (2) they analyze static snapshots, failing to capture disease progression over time; and (3) they often perform coarse feature fusion, lacking semantic disentanglement and interpretability. To address these issues, we propose LUMIN (Longitudinal Multi-modal Knowledge Decomposition Network), a novel framework that integrates longitudinal mammograms and electronic health records (EHRs) for recurrence prediction. LUMIN leverages a vision-language contrastive pretraining backbone to align multi-modal representations and introduces two knowledge extraction modules: (1) a Cross-Modal Disentangled Knowledge Extractor (CM-DKE) that separates shared, complementary, and modality-specific information across imaging and text; and (2) a Temporal Evolution Disentangled Knowledge Extractor (TE-DKE) that captures time-invariant, time-varying, and time-specific features to model disease dynamics. Experiments on a large-scale dataset of 3,924 patients and 19,684 exams show that LUMIN significantly outperforms state-of-the-art baselines, demonstrating its effectiveness in capturing both multi-modal semantics and temporal heterogeneity for recurrence prediction.
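The abstract does not specify the objective used by the vision-language contrastive pretraining backbone. As a rough illustration of how contrastive pretraining can align paired image and text embeddings, the following is a minimal sketch of a symmetric, CLIP-style contrastive loss in plain Python. The function names, the temperature value of 0.07, and the toy two-dimensional embeddings are assumptions made for illustration; this is not the authors' implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    top = max(row)
    log_sum = top + math.log(sum(math.exp(x - top) for x in row))
    return [x - log_sum for x in row]


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Hypothetical CLIP-style loss: the i-th image and i-th text form a
    positive pair; every other pairing in the batch is a negative."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)

    # Cosine similarity between every image and every text, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]

    # Image-to-text direction: each row should peak on its diagonal entry.
    image_to_text = -sum(log_softmax(logits[k])[k] for k in range(n)) / n

    # Text-to-image direction: each column should peak on its diagonal entry.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    text_to_image = -sum(log_softmax(columns[k])[k] for k in range(n)) / n

    return 0.5 * (image_to_text + text_to_image)


# Toy example: correctly paired embeddings versus deliberately mismatched ones.
aligned_loss = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                          [[1.0, 0.0], [0.0, 1.0]])
mismatched_loss = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                             [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setting the loss is lower when each image embedding points in the same direction as its paired text embedding, which is the sense in which such an objective aligns the two modalities.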

Published

2026-03-14

How to Cite

Lu, C., Zhang, T., Liang, X., Gao, Y., Han, L., Wang, X., … Mann, R. (2026). LUMIN: A Longitudinal Multi-modal Knowledge Decomposition Network for Predicting Breast Cancer Recurrence. Proceedings of the AAAI Conference on Artificial Intelligence, 40(9), 7530–7538. https://doi.org/10.1609/aaai.v40i9.37693

Section

AAAI Technical Track on Computer Vision VI