DenserNet: Weakly Supervised Visual Localization Using Multi-Scale Feature Aggregation

Dongfang Liu; Yiming Cui; Liqi Yan; Christos Mousas; Baijian Yang; Yingjie Chen

doi:10.1609/aaai.v35i7.16760

Authors

Dongfang Liu, Purdue University
Yiming Cui, University of Florida
Liqi Yan, Fudan University
Christos Mousas, Purdue University
Baijian Yang, Purdue University
Yingjie Chen, Purdue University

DOI:

https://doi.org/10.1609/aaai.v35i7.16760

Keywords:

Localization, Mapping, and Navigation

Abstract

In this work, we introduce a Denser Feature Network (DenserNet) for visual localization. Our work provides three principal contributions. First, we develop a convolutional neural network (CNN) architecture that aggregates feature maps at different semantic levels into image representations. Using denser feature maps, our method can produce more keypoint features and increase image retrieval accuracy. Second, our model is trained end-to-end using only positive and negative GPS-tagged image pairs, with no pixel-level annotation. We use a weakly supervised triplet ranking loss to learn discriminative features and encourage keypoint feature repeatability for image representation. Finally, our method is computationally efficient because our architecture shares features and parameters during forward propagation. Our method is flexible and can be built on a lightweight backbone architecture to achieve appealing efficiency with a small penalty in accuracy. Extensive experimental results indicate that our method sets a new state of the art on four challenging large-scale localization benchmarks and three image retrieval benchmarks with the same level of supervision. The code is available at https://github.com/goodproj13/DenserNet
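The abstract does not spell out the form of the weakly supervised triplet ranking loss. As a minimal sketch, the following assumes a common formulation for GPS-tagged data: GPS proximity only yields potential positives, so the loss uses the closest one, and every clearly distant negative is pushed at least a margin farther away. The function names, the squared Euclidean distance, and the margin value are illustrative assumptions, not taken from the DenserNet code.

```python
from typing import Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two descriptors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weak_triplet_ranking_loss(
    query: Vector,
    potential_positives: Sequence[Vector],
    negatives: Sequence[Vector],
    margin: float = 0.1,  # hypothetical value, chosen for illustration only
) -> float:
    """Hinge-style ranking loss for one query under weak (GPS-only) supervision.

    GPS tags only say that a positive was captured nearby, not that it shows
    the same scene, so only the best-matching potential positive is used.
    """
    if not potential_positives or not negatives:
        raise ValueError("need at least one potential positive and one negative")

    # Distance to the closest potential positive.
    best_positive = min(squared_distance(query, p) for p in potential_positives)

    # Each negative contributes only if it violates the margin.
    return sum(
        max(0.0, best_positive + margin - squared_distance(query, n))
        for n in negatives
    )
```

In this sketch, a negative that already lies more than the margin beyond the best positive contributes zero, so only hard negatives would drive the gradient in a trainable version.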
