Tracklet Self-Supervised Learning for Unsupervised Person Re-Identification

Guile Wu; Xiatian Zhu; Shaogang Gong

doi:10.1609/aaai.v34i07.6921

Authors

Guile Wu, Queen Mary University of London
Xiatian Zhu, Vision Semantics Limited
Shaogang Gong, Queen Mary University of London

Abstract

Existing unsupervised person re-identification (re-id) methods mainly focus on cross-domain adaptation or one-shot learning. Although they are more scalable than their supervised learning counterparts, their reliance on a relevant labelled source domain, or on one labelled tracklet per person for initialisation, still restricts their scalability in real-world deployments. To alleviate these problems, some recent studies develop unsupervised tracklet association and bottom-up image clustering methods, but these still rely on explicit camera annotation or merely utilise suboptimal global clustering. In this work, we formulate a novel tracklet self-supervised learning (TSSL) method, which is capable of capitalising directly on abundant unlabelled tracklet data to optimise a feature embedding space for both video and image unsupervised re-id. This is achieved by designing a comprehensive unsupervised learning objective that accounts for tracklet frame coherence, tracklet neighbourhood compactness, and tracklet cluster structure in a unified formulation. As a purely unsupervised re-id model, TSSL is end-to-end trainable in the absence of source data annotation, person identity labels, and camera prior knowledge. Extensive experiments demonstrate the superiority of TSSL over a wide variety of state-of-the-art alternative methods on four large-scale person re-id benchmarks: Market-1501, DukeMTMC-ReID, MARS and DukeMTMC-VideoReID.
