Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search
Keywords: Image and Video Retrieval
Abstract
The goal of person search is to localize and match query persons in scene images. For efficiency, one-step methods have been developed that handle the pedestrian detection and identification sub-tasks jointly in a single network. Current one-step approaches face two major challenges. One is the mutual interference between the optimization objectives of the sub-tasks. The other is sub-optimal identification feature learning caused by the small batch size used in end-to-end training. To overcome these problems, we propose a decoupled and memory-reinforced network (DMRNet). Specifically, to reconcile the conflicts between the objectives, we simplify the standard tightly coupled pipelines and establish a deeply decoupled multi-task learning framework. Further, we build a memory-reinforced mechanism to boost identification feature learning. By queuing the identification features of recently accessed instances into a memory bank, the mechanism augments the construction of similarity pairs for pairwise metric learning. To keep the encoding of the stored features consistent, these features are extracted by a slow-moving average of the network. In this way, the dual networks reinforce each other and converge to robust solution states. Experimentally, the proposed method obtains 93.2% and 46.9% mAP on the CUHK-SYSU and PRW datasets, respectively, exceeding all existing one-step methods.
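The memory-reinforced mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `FeatureQueue`, `ema_update`, and `l2_normalize`, the momentum value, and the use of cosine similarity are all assumptions made for illustration, and the pairwise loss itself is left out because the abstract does not specify it.

```python
from collections import deque
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def ema_update(slow_params, fast_params, momentum=0.999):
    """Slow-moving average: nudge each slow parameter toward the fast one.

    The momentum value is an assumed placeholder, not taken from the paper.
    """
    return [momentum * s + (1.0 - momentum) * f
            for s, f in zip(slow_params, fast_params)]


class FeatureQueue:
    """Hypothetical fixed-size FIFO memory of (feature, identity) pairs."""

    def __init__(self, capacity):
        # deque with maxlen silently drops the oldest entry when full.
        self.items = deque(maxlen=capacity)

    def enqueue(self, features, identities):
        """Store features produced by the slow-moving network."""
        for feat, pid in zip(features, identities):
            self.items.append((l2_normalize(feat), pid))

    def pairs(self, query, query_id):
        """Return cosine similarities to same-identity and other entries."""
        q = l2_normalize(query)
        positives, negatives = [], []
        for feat, pid in self.items:
            sim = sum(a * b for a, b in zip(q, feat))
            (positives if pid == query_id else negatives).append(sim)
        return positives, negatives


# Usage: a queue of capacity 2 keeps only the two most recent features.
queue = FeatureQueue(capacity=2)
queue.enqueue([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [7, 8, 7])
pos, neg = queue.pairs([1.0, 0.0], query_id=7)
slow = ema_update([1.0], [0.0], momentum=0.9)
```

Because the queue holds features from many past batches, each query sees far more positive and negative pairs than a single small batch provides, while the slowly updated network keeps the stored features comparable over time.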
How to Cite
Han, C., Zheng, Z., Gao, C., Sang, N., & Yang, Y. (2021). Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search. Proceedings of the AAAI Conference on Artificial Intelligence, 35(2), 1505-1512. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/16241
AAAI Technical Track on Computer Vision I