Camera-Aware Proxies for Unsupervised Person Re-Identification

Authors

  • Menglin Wang, Zhejiang University
  • Baisheng Lai, Alibaba Group
  • Jianqiang Huang, Alibaba Group
  • Xiaojin Gong, Zhejiang University
  • Xian-Sheng Hua, Alibaba Group

DOI:

https://doi.org/10.1609/aaai.v35i4.16381

Keywords:

Image and Video Retrieval, Applications

Abstract

This paper tackles the purely unsupervised person re-identification (Re-ID) problem, which requires no annotations. Some previous methods adopt clustering techniques to generate pseudo labels and use those labels to train Re-ID models progressively. These methods are relatively simple yet effective. However, most clustering-based methods treat each cluster as a pseudo identity class, neglecting the large intra-ID variance caused mainly by changes in camera view. To address this issue, we propose to split each cluster into multiple proxies, each representing the instances that come from the same camera. These camera-aware proxies enable us to handle large intra-ID variance and generate more reliable pseudo labels for learning. Based on the camera-aware proxies, we design both intra- and inter-camera contrastive learning components so that our Re-ID model effectively learns ID discrimination within and across cameras. We also design a proxy-balanced sampling strategy, which further facilitates learning. Extensive experiments on three large-scale Re-ID datasets show that the proposed approach outperforms most unsupervised methods by a significant margin. In particular, on the challenging MSMT17 dataset, we gain 14.3% in Rank-1 accuracy and 10.2% in mAP over the second-best method. Code is available at: https://github.com/Terminator8758/CAP-master.
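
The cluster-splitting step described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' released code: the function name `split_into_camera_proxies`, the plain-list inputs, and the convention that a cluster label of -1 marks an un-clustered instance are all assumptions made for illustration. Each distinct (cluster, camera) pair simply receives its own proxy label.

```python
def split_into_camera_proxies(cluster_labels, camera_ids):
    """Assign one proxy label per (cluster, camera) pair.

    Hypothetical helper, not the authors' implementation.
    Assumption: a cluster label of -1 marks an un-clustered instance,
    which keeps the proxy label -1.
    """
    proxy_of_pair = {}   # maps (cluster, camera) -> proxy label
    proxy_labels = []
    for cluster, camera in zip(cluster_labels, camera_ids):
        if cluster == -1:
            proxy_labels.append(-1)
            continue
        key = (cluster, camera)
        if key not in proxy_of_pair:
            # First time this cluster is seen under this camera: new proxy.
            proxy_of_pair[key] = len(proxy_of_pair)
        proxy_labels.append(proxy_of_pair[key])
    return proxy_labels, proxy_of_pair


# Toy example: two clusters observed by three cameras, plus one outlier.
cluster_labels = [0, 0, 0, 1, 1, -1]
camera_ids = [0, 0, 1, 0, 2, 1]
proxy_labels, proxy_of_pair = split_into_camera_proxies(cluster_labels, camera_ids)
print(proxy_labels)  # [0, 0, 1, 2, 3, -1]
```

In this toy input, cluster 0 is seen by cameras 0 and 1 and is therefore split into two proxies, while cluster 1 is seen by cameras 0 and 2 and is split into two more. The proxy labels can then serve as finer-grained pseudo labels than the cluster labels alone.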

Published

2021-05-18

How to Cite

Wang, M., Lai, B., Huang, J., Gong, X., & Hua, X.-S. (2021). Camera-Aware Proxies for Unsupervised Person Re-Identification. Proceedings of the AAAI Conference on Artificial Intelligence, 35(4), 2764-2772. https://doi.org/10.1609/aaai.v35i4.16381

Section

AAAI Technical Track on Computer Vision III