Divide and Conquer: Hybrid Pre-training for Person Search

Authors

  • Yanling Tian, Nanjing University of Science and Technology
  • Di Chen, Nanjing University of Science and Technology
  • Yunan Liu, Nanjing University of Science and Technology; Dalian Maritime University
  • Jian Yang, Nanjing University of Science and Technology
  • Shanshan Zhang, Nanjing University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v38i6.28329

Keywords:

CV: Object Detection & Categorization, CV: Representation Learning for Vision

Abstract

Large-scale pre-training has proven to be an effective method for improving performance across different tasks. Current person search methods use ImageNet pre-trained models for feature extraction, yet this is not an optimal solution because of the gap between the pre-training task and the downstream person search task. Therefore, in this paper, we focus on pre-training for person search, which involves detecting and re-identifying individuals simultaneously. Although labeled data for person search is scarce, datasets for its two sub-tasks, person detection and re-identification, are relatively abundant. To this end, we propose a hybrid pre-training framework specifically designed for person search that uses sub-task data only. It consists of a hybrid learning paradigm that handles data with different kinds of supervision, and an intra-task alignment module that alleviates domain discrepancy under limited resources. To the best of our knowledge, this is the first work that investigates how to support full-task pre-training using sub-task data. Extensive experiments demonstrate that our pre-trained model achieves significant improvements across diverse protocols, varying the person search method, fine-tuning data, pre-training data, and model backbone. For example, our model improves the mAP of ResNet50-based NAE by 10.3% in relative terms. Our code and pre-trained models are released for plug-and-play usage by the person search community (https://github.com/personsearch/PretrainPS).
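
The 10.3% figure quoted for NAE is a relative improvement in mAP, not a gain in absolute percentage points. The short sketch below shows how the two differ. The function name and the mAP values are hypothetical, chosen only for illustration; they are not results reported in the paper.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a percentage of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# Hypothetical mAP values for illustration only (NOT taken from the paper).
baseline_map = 40.0
improved_map = 44.12

absolute_gain = improved_map - baseline_map  # gain in mAP points
relative_gain = relative_improvement(baseline_map, improved_map)  # gain in percent

print(f"absolute gain: {absolute_gain:.2f} points")
print(f"relative gain: {relative_gain:.1f}%")
```

With these illustrative numbers, an absolute gain of about 4 mAP points corresponds to a relative gain of about 10%, so the same relative figure implies a smaller absolute gain when the baseline is low.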

Published

2024-03-24

How to Cite

Tian, Y., Chen, D., Liu, Y., Yang, J., & Zhang, S. (2024). Divide and Conquer: Hybrid Pre-training for Person Search. Proceedings of the AAAI Conference on Artificial Intelligence, 38(6), 5224–5232. https://doi.org/10.1609/aaai.v38i6.28329

Issue

Vol. 38 No. 6 (2024)

Section

AAAI Technical Track on Computer Vision V