Generalizing Vision-Language Models with Dedicated Prompt Guidance
DOI:
https://doi.org/10.1609/aaai.v40i28.39492

Abstract
Fine-tuning large pretrained vision-language models (VLMs) has emerged as a prevalent paradigm for downstream adaptation, yet it faces a critical trade-off between domain specificity and domain generalization (DG) ability. Current methods typically fine-tune a universal model on the entire dataset, which potentially compromises the ability to generalize to unseen domains. To address this trade-off, we provide a theoretical understanding of the generalization ability of VLM fine-tuning, which reveals that training multiple parameter-efficient expert models on partitioned source domains leads to better generalization than fine-tuning a universal model. Inspired by this finding, we propose a two-step domain-expert-Guided DG (GuiDG) framework. GuiDG first employs prompt tuning to obtain source domain experts, then introduces a Cross-Modal Attention module to guide the fine-tuning of the vision encoder via adaptive expert integration. To better evaluate few-shot DG, we construct ImageNet-DG from ImageNet and its variants. Extensive experiments on standard DG benchmarks and ImageNet-DG demonstrate that GuiDG improves upon state-of-the-art fine-tuning methods while maintaining efficiency.
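To make the two-step design in the abstract concrete, below is a minimal PyTorch sketch: step 1 learns one prompt-tuned expert per source domain, and step 2 uses a cross-modal attention layer in which image features adaptively attend over the frozen experts' text features to guide vision-encoder fine-tuning. All names here (DomainExpert, CrossModalAttention, n_ctx) are hypothetical illustrations under an assumed CLIP-like backbone, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DomainExpert(nn.Module):
    """Step 1 (sketch): learnable prompt context vectors for one source
    domain, in the style of prompt tuning (e.g., CoOp)."""
    def __init__(self, n_ctx: int, dim: int):
        super().__init__()
        # Small-init learnable context tokens prepended to class names.
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)

class CrossModalAttention(nn.Module):
    """Step 2 (sketch): image features attend over per-domain expert
    text features, producing an adaptive mixture that guides the
    vision encoder during fine-tuning."""
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)  # query from image features
        self.k = nn.Linear(dim, dim)  # keys from expert features
        self.v = nn.Linear(dim, dim)  # values from expert features

    def forward(self, img_feat: torch.Tensor, expert_feats: torch.Tensor):
        # img_feat: (B, D); expert_feats: (E, D), one row per frozen expert.
        q = self.q(img_feat)                                  # (B, D)
        k, v = self.k(expert_feats), self.v(expert_feats)     # (E, D)
        attn = F.softmax(q @ k.t() / k.size(-1) ** 0.5, dim=-1)  # (B, E)
        guidance = attn @ v                # (B, D) adaptive expert mix
        return img_feat + guidance         # guided image representation
```

The attention weights play the role of the "adaptive expert integration" mentioned in the abstract: each image softly selects which domain experts are most relevant to it, rather than relying on a single universal model.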
Published
2026-03-14
How to Cite
Li, X., Min, Y., Chen, H., Du, Z., Li, F., & Li, J. (2026). Generalizing Vision-Language Models with Dedicated Prompt Guidance. Proceedings of the AAAI Conference on Artificial Intelligence, 40(28), 23239–23247. https://doi.org/10.1609/aaai.v40i28.39492
Issue
Vol. 40 No. 28 (2026): Proceedings of the AAAI Conference on Artificial Intelligence
Section
AAAI Technical Track on Machine Learning V