CLIP-driven View-aware Prompt Learning for Unsupervised Vehicle Re-identification

Jiyang Xu; Qi Wang; Xin Xiong; Di Gai; Ruihua Zhou; Dong Wang

doi:10.1609/aaai.v39i8.32962

Authors

Jiyang Xu, Nanchang University
Qi Wang, Nanchang University
Xin Xiong, Nanchang University
Di Gai, Nanchang University
Ruihua Zhou, Nanchang University
Dong Wang, Nanchang University

DOI:

https://doi.org/10.1609/aaai.v39i8.32962

Abstract

With the emergence of vision-language pre-trained models such as CLIP, textual prompts have recently been introduced into re-identification (Re-ID) tasks to obtain more robust multimodal information. However, most textual descriptions used in vehicle Re-ID contain only identity index words, with no specific words describing vehicle view information, which makes them difficult to apply widely to vehicle Re-ID tasks with view variations. This motivates us to propose a CLIP-driven view-aware prompt learning framework for unsupervised vehicle Re-ID. We first design a learnable textual prompt template, called view-aware context optimization (ViewCoOp), based on dynamic multi-view word embeddings, which captures the proportion and position encoding of each view within the whole vehicle body region. Subsequently, a cross-modal mutual graph is constructed to explore inter-modal and intra-modal connections. Each sample is treated as a graph node that carries the textual features extracted with ViewCoOp and the visual features of the image. Moreover, the connectivity between graph node pairs is determined by leveraging the inter-cluster and intra-cluster correlations in the bimodal clustering results. Lastly, the proposed cross-modal mutual graph method uses supervisory information from the bimodal gap to directly fine-tune the image encoder of CLIP for downstream unsupervised vehicle Re-ID tasks. Extensive experiments verify that the proposed method effectively acquires cross-modal description ability from multiple views.
