vMFCoOp: Towards Equilibrium on a Unified Hyperspherical Manifold for Prompting Biomedical VLMs

Authors

  • Minye Shao, Durham University
  • Sihan Guo, Durham University
  • Xinrun Li, Durham University
  • Xingyu Miao, Durham University
  • Haoran Duan, Tsinghua University
  • Yang Long, Durham University

DOI:

https://doi.org/10.1609/aaai.v40i11.37839

Abstract

Recent advances in context optimization (CoOp) guided by large language model (LLM)–distilled medical semantic priors offer a scalable alternative to manual prompt engineering and full fine-tuning for adapting biomedical CLIP-based vision-language models (VLMs). However, prompt learning in this context is challenged by semantic misalignment between LLMs and CLIP variants due to divergent training corpora and model architectures; it further lacks scalability across continuously evolving families of foundation models. More critically, pairwise multimodal alignment via conventional Euclidean-space optimization lacks the capacity to model unified representations or apply localized geometric constraints, which tends to amplify modality gaps in complex biomedical imaging and destabilize few-shot adaptation. To address these challenges, we propose vMFCoOp, a framework that inversely estimates von Mises–Fisher (vMF) distributions on a shared Hyperspherical Manifold, aligning semantic biases between arbitrary LLMs and CLIP backbones via Unified Semantic Anchors to achieve robust biomedical prompting and superior few-shot classification. Grounded in three complementary constraints, vMFCoOp demonstrates consistent improvements across 14 medical datasets, 12 medical imaging modalities, and 13 anatomical regions, outperforming state-of-the-art methods in accuracy, generalization, and clinical applicability.
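The abstract's core geometric idea is modeling unit-normalized embeddings with von Mises–Fisher (vMF) distributions on a hyperspherical manifold. As background for readers unfamiliar with vMF estimation, the sketch below shows the standard closed-form approximation (Banerjee et al., 2005) for recovering a vMF mean direction and concentration from unit vectors; it is an illustrative primer on the distribution the paper builds on, not the authors' inverse-estimation or anchor-alignment procedure.

```python
import numpy as np

def estimate_vmf(X):
    """Estimate vMF parameters from unit vectors X of shape (n, d).

    Returns the mean direction mu (a unit vector) and the
    concentration kappa, using the common approximation
    kappa ~= r_bar * (d - r_bar^2) / (1 - r_bar^2),
    where r_bar is the length of the mean resultant vector.
    """
    n, d = X.shape
    resultant = X.sum(axis=0)           # sum of unit vectors
    R = np.linalg.norm(resultant)       # resultant length
    mu = resultant / R                  # mean direction on the sphere
    r_bar = R / n                       # mean resultant length in [0, 1)
    kappa = r_bar * (d - r_bar**2) / (1 - r_bar**2)
    return mu, kappa

# Toy usage: embeddings clustered around the first basis direction,
# renormalized onto the unit hypersphere (as CLIP-style features are).
rng = np.random.default_rng(0)
d = 8
target = np.zeros(d)
target[0] = 1.0
X = target + 0.1 * rng.standard_normal((500, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
mu, kappa = estimate_vmf(X)
```

Higher kappa corresponds to tighter clustering around mu, which is why vMF distributions give localized geometric control on the sphere that a single Euclidean mean does not.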

Published

2026-03-14

How to Cite

Shao, M., Guo, S., Li, X., Miao, X., Duan, H., & Long, Y. (2026). vMFCoOp: Towards Equilibrium on a Unified Hyperspherical Manifold for Prompting Biomedical VLMs. Proceedings of the AAAI Conference on Artificial Intelligence, 40(11), 8851–8859. https://doi.org/10.1609/aaai.v40i11.37839

Section

AAAI Technical Track on Computer Vision VIII