Gradient Corner Pooling for Keypoint-Based Object Detection

Authors

  • Xuyang Li, Xidian University
  • Xuemei Xie, Xidian University; Pazhou Lab, Huangpu
  • Mingxuan Yu, Xidian University
  • Jiakai Luo, Xidian University
  • Chengwei Rao, Xidian University
  • Guangming Shi, Xidian University; Peng Cheng Laboratory

DOI:

https://doi.org/10.1609/aaai.v37i2.25231

Keywords:

CV: Object Detection & Categorization, CV: Learning & Optimization for CV

Abstract

Detecting objects as multiple keypoints is an important approach among anchor-free object detection methods, and corner pooling is an effective feature encoding method for locating corners. Corner pooling locates the corners of a bounding box by summing feature maps that have been max-pooled along the x and y directions, respectively. In this unidirectional max pooling operation, the features of densely arranged objects of the same class are prone to occluding one another. To address this, we propose a method named Gradient Corner Pooling. It encodes the spatial distance information of objects on the feature map during the unidirectional pooling process, which effectively alleviates the occlusion among features of same-class objects. Moreover, gradient corner pooling has the same computational complexity as traditional corner pooling and can therefore be implemented efficiently. As a drop-in replacement for corner pooling, it yields consistent improvements across various keypoint-based methods. We verify gradient corner pooling both on the MS-COCO dataset and in real scenarios. Networks with gradient corner pooling locate the corner points earlier in training and achieve an average accuracy improvement of 0.2%-1.6% on the MS-COCO dataset. In tests on real scenes, detectors with gradient corner pooling show better angle adaptability for arrayed objects.
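
To make the pooling step in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. With decay=0.0 it performs ordinary top-left corner pooling: a running maximum along x, a running maximum along y, and the sum of the two. The decay parameter, the function names, and the linear per-step penalty are assumptions introduced here purely for illustration; the abstract says only that spatial distance is encoded during the unidirectional pooling and does not give the formula. The sketch does keep one property the abstract states, namely that the scan costs the same as the plain running maximum.

```python
def directional_max(values, decay=0.0):
    """Scan from the last element to the first, carrying a running maximum.

    decay=0.0 is the plain unidirectional max pooling used by corner pooling.
    decay>0.0 is a HYPOTHETICAL distance penalty (an assumption, not the
    paper's formula): a carried maximum loses `decay` per step it travels.
    """
    out = list(values)
    for i in range(len(values) - 2, -1, -1):
        out[i] = max(values[i], out[i + 1] - decay)
    return out


def top_left_corner_pool(fmap_x, fmap_y, decay=0.0):
    """Sum a map pooled along x (looking right) and one pooled along y (looking down)."""
    h, w = len(fmap_x), len(fmap_x[0])
    # Pool every row of fmap_x along the x direction.
    horiz = [directional_max(row, decay) for row in fmap_x]
    # Pool every column of fmap_y along the y direction; cols[j][i] is row i, column j.
    cols = [directional_max([fmap_y[i][j] for i in range(h)], decay) for j in range(w)]
    return [[horiz[i][j] + cols[j][i] for j in range(w)] for i in range(h)]


# Two same-class responses on one row: a weaker one (3.0) and a stronger one (5.0).
row = [0.0, 3.0, 0.0, 5.0]
plain = directional_max(row)               # the stronger response overwrites the weaker one
decayed = directional_max(row, decay=1.5)  # the weaker, nearer response survives at index 1
```

In this toy row, the plain scan returns 5.0 at every position, so the response of 3.0 is hidden behind the stronger one, which is the occlusion effect the abstract describes. With the hypothetical decay of 1.5, the value at index 1 stays 3.0 because the carried maximum has dropped to 2.0 by the time it arrives.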

Published

2023-06-26

How to Cite

Li, X., Xie, X., Yu, M., Luo, J., Rao, C., & Shi, G. (2023). Gradient Corner Pooling for Keypoint-Based Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 37(2), 1460-1467. https://doi.org/10.1609/aaai.v37i2.25231

Section

AAAI Technical Track on Computer Vision II