Dynamic Anchor Learning for Arbitrary-Oriented Object Detection
Keywords:Object Detection & Categorization
AbstractArbitrary-oriented objects widely appear in natural scenes, aerial photographs, remote sensing images, etc., and thus arbitrary-oriented object detection has received considerable attention. Many current rotation detectors use plenty of anchors with different orientations to achieve spatial alignment with ground truth boxes. Intersection-over-Union (IoU) is then applied to sample the positive and negative candidates for training. However, we observe that the selected positive anchors cannot always ensure accurate detections after regression, while some negative samples can achieve accurate localization. It indicates that the quality assessment of anchors through IoU is not appropriate, and this further leads to inconsistency between classification confidence and localization accuracy. In this paper, we propose a dynamic anchor learning (DAL) method, which utilizes the newly defined matching degree to comprehensively evaluate the localization potential of the anchors and carries out a more efficient label assignment process. In this way, the detector can dynamically select high-quality anchors to achieve accurate object detection, and the divergence between classification and regression will be alleviated. With the newly introduced DAL, we can achieve superior detection performance for arbitrary-oriented objects with only a few horizontal preset anchors. Experimental results on three remote sensing datasets HRSC2016, DOTA, UCAS-AOD as well as a scene text dataset ICDAR 2015 show that our method achieves substantial improvement compared with the baseline model. Besides, our approach is also universal for object detection using horizontal bound box. The code and models are available at https://github.com/ming71/DAL.
How to Cite
Ming, Q., Zhou, Z., Miao, L., Zhang, H., & Li, L. (2021). Dynamic Anchor Learning for Arbitrary-Oriented Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 35(3), 2355-2363. https://doi.org/10.1609/aaai.v35i3.16336
AAAI Technical Track on Computer Vision II