R-FCN++: Towards Accurate Region-Based Fully Convolutional Networks for Object Detection

Authors

  • Zeming Li, Tsinghua University
  • Yilun Chen, Megvii Inc
  • Gang Yu, Megvii Inc
  • Yangdong Deng, Tsinghua University

Keywords:

Computer Vision, Object Detection

Abstract

Region-based detectors such as Faster R-CNN and R-FCN have achieved leading performance on object detection benchmarks. However, Faster R-CNN uses RoI pooling to extract the features of each region, which might harm classification because RoI pooling loses spatial resolution. It also becomes slow when a large number of proposals are used. R-FCN is a fully convolutional structure that uses a position-sensitive pooling layer to extract the prediction score of each region, which speeds up the network by sharing computation across RoIs and prevents the feature map from losing information in RoI pooling. However, R-FCN cannot benefit from a fully connected layer (or global average pooling), which enables Faster R-CNN to utilize global context information. In this paper, we propose R-FCN++ to address this issue in two ways. First, we introduce a Global Context Module that improves the classification score maps by adopting large, separable convolutional kernels. Second, we introduce a new pooling method that better extracts scores from the score maps by using row-wise or column-wise max pooling. Our approach achieves state-of-the-art single-model results on both the Pascal VOC and MS COCO object detection benchmarks: 87.3% on the Pascal VOC 2012 test set and 42.3% on the COCO 2015 test-dev set. Code will be made publicly available.
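The two ingredients named in the abstract can be illustrated with a minimal, single-channel sketch in plain Python. This is not the authors' implementation: the helper names, the kernel values, and the choice to average the row or column maxima are all assumptions made for illustration, since the abstract does not specify kernel sizes or how the pooled maxima are aggregated. The sketch shows that a k x 1 kernel followed by a 1 x k kernel reproduces the corresponding k x k kernel with 2k weights instead of k*k, and what row-wise or column-wise max pooling over one RoI's score grid could look like.

```python
# Illustrative sketch only; all names and values here are hypothetical.

def conv2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation on nested lists (one channel)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def separable_conv(image, col_kernel, row_kernel):
    """Apply a k x 1 kernel, then a 1 x k kernel (2k weights, not k*k)."""
    vertical = [[w] for w in col_kernel]   # shape k x 1
    horizontal = [list(row_kernel)]        # shape 1 x k
    return conv2d_valid(conv2d_valid(image, vertical), horizontal)


def row_wise_max_pool(scores):
    """Take the max of each row of an RoI's score grid, then average them.

    Averaging the row maxima is an assumption of this sketch.
    """
    return sum(max(row) for row in scores) / len(scores)


def column_wise_max_pool(scores):
    """Take the max of each column of an RoI's score grid, then average them."""
    columns = list(zip(*scores))
    return sum(max(col) for col in columns) / len(columns)


if __name__ == "__main__":
    image = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12],
             [13, 14, 15, 16]]
    col_kernel = [1, 2, 1]
    row_kernel = [1, 0, -1]
    # The equivalent dense 3 x 3 kernel is the outer product of the two 1-D kernels.
    dense_kernel = [[c * r for r in row_kernel] for c in col_kernel]
    print(separable_conv(image, col_kernel, row_kernel)
          == conv2d_valid(image, dense_kernel))

    roi_scores = [[1, 9, 2],
                  [4, 3, 8],
                  [0, 5, 7]]
    print(row_wise_max_pool(roi_scores), column_wise_max_pool(roi_scores))
```

The separable form is what makes large kernels affordable: the weight count grows linearly in k rather than quadratically, so a wide receptive field over the score maps does not require a dense k x k filter.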

Published

2018-04-27

How to Cite

Li, Z., Chen, Y., Yu, G., & Deng, Y. (2018). R-FCN++: Towards Accurate Region-Based Fully Convolutional Networks for Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/12265