DART: Dual-Modal Adaptive Online Prompting and Knowledge Retention for Test-Time Adaptation

Authors

  • Zichen Liu, Peking University
  • Hongbo Sun, Peking University
  • Yuxin Peng, Peking University
  • Jiahuan Zhou, Peking University

DOI:

https://doi.org/10.1609/aaai.v38i13.29320

Keywords:

ML: Transfer, Domain Adaptation, Multi-Task Learning, CV: Language and Vision

Abstract

As an emerging paradigm, CLIP-based pre-trained vision-language models can readily facilitate downstream tasks in zero-shot or few-shot fine-tuning settings. However, they still face critical challenges in test-time generalization due to shifts between the training and test data distributions, which hinder further performance improvement. To address this problem, recent works have introduced Test-Time Adaptation (TTA) techniques to CLIP that dynamically learn text prompts using only test samples. However, these methods have limited learning capacity because they overlook visual-modality information, and they underutilize the knowledge in previously seen test samples, resulting in reduced performance. In this paper, we propose a novel Dual-modal Adaptive online prompting and knowledge ReTention method, called DART, to overcome these challenges. To increase the learning capacity, DART captures knowledge from each test sample by learning class-specific text prompts and instance-level image prompts. Additionally, to fully leverage the knowledge from previously seen test samples, DART employs dual-modal knowledge retention prompts that adaptively retain the acquired knowledge, thereby enhancing predictions on subsequent test samples. Extensive experiments on various large-scale benchmarks demonstrate the effectiveness of the proposed DART against state-of-the-art methods.
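
The abstract outlines two mechanisms: dual-modal (text plus image) prompts learned online from each test sample, and retention prompts that carry knowledge forward across the test stream. The sketch below illustrates the general shape of such a pipeline in PyTorch; it is not the authors' implementation. The toy frozen encoders, the dimensions, the entropy-minimization objective, and the EMA retention buffers are all assumptions chosen to keep the example self-contained and runnable.

```python
# Hypothetical sketch of dual-modal online prompting with retention (not the
# authors' code): all names, shapes, and the toy frozen encoders are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
NUM_CLASSES, DIM, N_CTX = 10, 64, 4

# Frozen stand-ins for CLIP's text and image encoders, kept simple so the
# sketch is self-contained; real use would wrap an actual CLIP model.
text_encoder = nn.Linear(N_CTX * DIM, DIM).requires_grad_(False)
image_encoder = nn.Linear((N_CTX + 1) * DIM, DIM).requires_grad_(False)

# Learnable dual-modal prompts: one text context per class (class-specific)
# and an image context prepended to every test instance (instance-level).
text_ctx = nn.Parameter(0.02 * torch.randn(NUM_CLASSES, N_CTX, DIM))
img_ctx = nn.Parameter(0.02 * torch.randn(N_CTX, DIM))
optimizer = torch.optim.AdamW([text_ctx, img_ctx], lr=1e-3)

def predict(image_tokens):
    """Score each image against all classes by cosine similarity, CLIP-style."""
    b = image_tokens.size(0)
    img_in = torch.cat([img_ctx.expand(b, -1, -1), image_tokens.unsqueeze(1)], dim=1)
    img_feat = F.normalize(image_encoder(img_in.flatten(1)), dim=-1)
    txt_feat = F.normalize(text_encoder(text_ctx.flatten(1)), dim=-1)
    return 100.0 * img_feat @ txt_feat.t()  # logits over NUM_CLASSES

# Retention buffers: an exponential moving average of the prompts stands in
# for the paper's knowledge retention prompts; in practice the retained
# prompts would also feed into prediction on later test samples.
ema_text_ctx, ema_img_ctx = text_ctx.detach().clone(), img_ctx.detach().clone()

for step in range(3):  # simulated online test stream (unlabeled batches)
    test_batch = torch.randn(8, DIM)  # stand-in for image tokens/features
    logits = predict(test_batch)
    probs = logits.softmax(dim=-1)
    # Entropy minimization: a standard unsupervised TTA objective.
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    with torch.no_grad():  # retain knowledge from the samples seen so far
        ema_text_ctx.mul_(0.99).add_(text_ctx, alpha=0.01)
        ema_img_ctx.mul_(0.99).add_(img_ctx, alpha=0.01)
    print(f"step {step}: batch entropy = {entropy.item():.3f}")
```

The EMA update is only a simple proxy for retention; the paper's method learns dedicated dual-modal retention prompts that adaptively combine past and current knowledge when predicting on subsequent test samples.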

Published

2024-03-24

How to Cite

Liu, Z., Sun, H., Peng, Y., & Zhou, J. (2024). DART: Dual-Modal Adaptive Online Prompting and Knowledge Retention for Test-Time Adaptation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(13), 14106-14114. https://doi.org/10.1609/aaai.v38i13.29320

Issue

Vol. 38 No. 13 (2024)

Section

AAAI Technical Track on Machine Learning IV