Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning

Authors

  • Jian Lang, University of Electronic Science and Technology of China
  • Zhangtao Cheng, University of Electronic Science and Technology of China
  • Ting Zhong, University of Electronic Science and Technology of China; Kash Institute of Electronics and Information Industry
  • Fan Zhou, University of Electronic Science and Technology of China; Kash Institute of Electronics and Information Industry

DOI:

https://doi.org/10.1609/aaai.v39i17.33984

Abstract

Multimodal learning with incomplete modalities is both practical and challenging. Recently, researchers have focused on enhancing the robustness of pre-trained MultiModal Transformers (MMTs) under missing-modality conditions by applying learnable prompts. However, these prompt-based methods face several limitations: (1) incomplete modalities provide restricted modal cues for task-specific inference, (2) dummy imputation for missing content causes information loss and introduces noise, and (3) static prompts are instance-agnostic, offering limited knowledge for instances with various missing conditions. To address these issues, we propose RAGPT, a novel Retrieval-AuGmented dynamic Prompt Tuning framework. RAGPT comprises three modules: (I) the multi-channel retriever, which identifies similar instances through a within-modality retrieval strategy, (II) the missing modality generator, which recovers missing information using retrieved contexts, and (III) the context-aware prompter, which captures contextual knowledge from relevant instances and generates dynamic prompts that substantially enhance the MMT’s robustness. Extensive experiments conducted on three real-world datasets show that RAGPT consistently outperforms all competitive baselines in handling incomplete-modality problems.
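
The abstract does not give implementation details, so the following is only an illustrative sketch of the general idea behind within-modality retrieval followed by recovery of a missing modality. It assumes cosine similarity over precomputed feature vectors and simple mean pooling of the retrieved neighbors; the function names (cosine_top_k, impute_missing), the toy memory bank, and the pooling choice are hypothetical and are not taken from RAGPT.

```python
import numpy as np

def cosine_top_k(query, bank, k):
    """Return indices of the k rows of `bank` most cosine-similar to `query`."""
    q = query / (np.linalg.norm(query) + 1e-12)
    b = bank / (np.linalg.norm(bank, axis=1, keepdims=True) + 1e-12)
    sims = b @ q
    return np.argsort(-sims)[:k]

def impute_missing(query_feat, bank_available, bank_missing, k=2):
    """Retrieve neighbors using the available modality only (within-modality),
    then mean-pool the neighbors' features for the missing modality.
    Mean pooling is an assumption made for this sketch."""
    idx = cosine_top_k(query_feat, bank_available, k)
    return idx, bank_missing[idx].mean(axis=0)

# Toy memory bank (hypothetical): 4 instances with text and image features.
text_bank = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
image_bank = np.array([[1.0, 1.0, 0.0],
                       [3.0, 1.0, 0.0],
                       [0.0, 0.0, 5.0],
                       [9.0, 9.0, 9.0]])

# Query instance whose image modality is missing; only text is available.
query_text = np.array([1.0, 0.05])
neighbors, image_estimate = impute_missing(query_text, text_bank, image_bank, k=2)
print(neighbors, image_estimate)
```

In this toy setup the two instances whose text features are closest to the query supply the estimate for the missing image features. A learned generator and instance-specific prompts, as described in the abstract, would replace the mean pooling used here.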

Published

2025-04-11

How to Cite

Lang, J., Cheng, Z., Zhong, T., & Zhou, F. (2025). Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 39(17), 18035-18043. https://doi.org/10.1609/aaai.v39i17.33984

Issue

Vol. 39 No. 17 (2025)

Section

AAAI Technical Track on Machine Learning III