Domain-Controlled Prompt Learning
DOI:
https://doi.org/10.1609/aaai.v38i2.27853
Keywords:
CV: Language and Vision, CV: Large Vision Models, CV: Multi-modal Vision
Abstract
Large pre-trained vision-language models, such as CLIP, have shown remarkable generalization capabilities across various tasks when appropriate text prompts are provided. However, adapting these models to specific domains, such as remote sensing images (RSIs) or medical images, remains unexplored and challenging. Existing prompt learning methods often lack domain awareness or domain-transfer mechanisms, leading to suboptimal performance because domain-specific images are misinterpreted through natural-image patterns. To tackle this dilemma, we propose Domain-Controlled Prompt Learning for specific domains. Specifically, a large-scale specific-domain foundation model (LSDM) is first introduced to provide essential domain knowledge. Using lightweight neural networks, we transfer this knowledge into domain biases, which are directly incorporated to control both the visual and language branches and obtain domain-adaptive prompts. Simultaneously, to overcome overfitting, we propose a novel noise-adding strategy, requiring no extra trainable parameters, that helps the model escape suboptimal solutions through global domain oscillation. Experimental results show our method achieves state-of-the-art performance on specific-domain image recognition datasets. Our code is available at https://github.com/caoql98/DCPL.
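To make the two mechanisms in the abstract concrete, here is a minimal PyTorch-style sketch. It is not the authors' released implementation (see the GitHub link above); the names DomainBias, domain_controlled_prompts, lsdm_feat, and noise_scale, as well as all dimensions, are illustrative assumptions.

# Hypothetical sketch of the two ideas in the abstract; all names and shapes
# are illustrative, not the authors' API.
import torch
import torch.nn as nn

class DomainBias(nn.Module):
    """Lightweight network mapping LSDM features to a prompt-space bias."""
    def __init__(self, lsdm_dim: int, prompt_dim: int):
        super().__init__()
        # A small bottleneck MLP keeps the adapter lightweight.
        self.net = nn.Sequential(
            nn.Linear(lsdm_dim, lsdm_dim // 4),
            nn.ReLU(inplace=True),
            nn.Linear(lsdm_dim // 4, prompt_dim),
        )

    def forward(self, lsdm_feat: torch.Tensor) -> torch.Tensor:
        return self.net(lsdm_feat)

def domain_controlled_prompts(prompts, lsdm_feat, bias_net,
                              noise_scale=0.1, training=True):
    """Directly add the domain bias to the learnable prompts; during training,
    inject parameter-free Gaussian noise so optimization can oscillate out of
    sharp, overfitted minima (assumed form of the noise-adding strategy)."""
    bias = bias_net(lsdm_feat)             # specific-domain knowledge -> bias
    adapted = prompts + bias.unsqueeze(1)  # direct incorporation into prompts
    if training:
        adapted = adapted + noise_scale * torch.randn_like(adapted)
    return adapted

# Toy usage with made-up dimensions; the same bias idea would apply to both
# the visual and language prompt branches.
prompts = nn.Parameter(torch.randn(4, 16, 512))   # (batch, n_ctx, dim)
lsdm_feat = torch.randn(4, 768)                   # features from the LSDM
bias_net = DomainBias(lsdm_dim=768, prompt_dim=512)
out = domain_controlled_prompts(prompts, lsdm_feat, bias_net)
print(out.shape)  # torch.Size([4, 16, 512])

Adding the bias directly to the prompt tokens matches the abstract's "directly incorporated" phrasing, and the noise term introduces no trainable parameters, consistent with the claimed parameter-free escape from suboptimal solutions.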
Published
2024-03-24
How to Cite
Cao, Q., Xu, Z., Chen, Y., Ma, C., & Yang, X. (2024). Domain-Controlled Prompt Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 936-944. https://doi.org/10.1609/aaai.v38i2.27853
Issue
Vol. 38 No. 2 (2024)
Section
AAAI Technical Track on Computer Vision I