Unlocking Multi-Modal Potentials for Link Prediction on Dynamic Text-Attributed Graphs
DOI:
https://doi.org/10.1609/aaai.v40i32.39956
Abstract
Dynamic Text-Attributed Graphs (DyTAGs) are a novel graph paradigm that captures evolving temporal events (edges) alongside rich textual attributes. Existing studies can be broadly categorized into TGNN-driven and LLM-driven approaches, both of which encode textual attributes and temporal structures for DyTAG representation. We observe that DyTAGs inherently comprise three distinct modalities: temporal, textual, and structural, often exhibiting completely disjoint distributions. However, the first two modalities are largely overlooked by existing studies, leading to suboptimal performance. To address this, we propose MoMent, a multi-modal network that explicitly models, integrates, and aligns each modality to learn node representations for link prediction. Given the disjoint nature of the original modality distributions, we first construct modality-specific features and encode them using individual encoders to capture correlations across temporal patterns, semantic context, and local structures. Each encoder generates modality-specific tokens, which are then fused into comprehensive node representations with a theoretical guarantee. To avoid disjoint subspaces of these heterogeneous modalities, we propose a dual-domain alignment loss that first aligns their distributions globally and then fine-tunes coherence at the instance level. This enhances coherent representations from temporal, textual, and structural views. Extensive experiments across seven datasets show that MoMent achieves up to 17.28% accuracy improvement and up to 31x speed-up against eight baselines.
Published
2026-03-14
How to Cite
Xu, Y., Zhang, W., Zhang, Y., Lin, X., & Xu, X. (2026). Unlocking Multi-Modal Potentials for Link Prediction on Dynamic Text-Attributed Graphs. Proceedings of the AAAI Conference on Artificial Intelligence, 40(32), 27386–27394. https://doi.org/10.1609/aaai.v40i32.39956
Section
AAAI Technical Track on Machine Learning IX