ConvMatch: Rethinking Network Design for Two-View Correspondence Learning

Authors

  • Shihua Zhang, Wuhan University
  • Jiayi Ma, Wuhan University

DOI:

https://doi.org/10.1609/aaai.v37i3.25456

Keywords:

CV: Motion & Tracking, CV: 3D Computer Vision, CV: Image and Video Retrieval

Abstract

The multilayer perceptron (MLP) has been widely used in two-view correspondence learning because only unordered correspondences are provided, and it extracts deep features from individual correspondences effectively. However, its lack of context information limits performance, and hence many follow-up studies design extra complex blocks to capture such information. In this paper, we take a novel perspective and design a correspondence learning network, called ConvMatch, that for the first time can leverage a convolutional neural network (CNN) as the backbone to capture better context, thus avoiding the complex design of extra blocks. Specifically, observing that sparse motion vectors and a dense motion field can be converted into each other by interpolation and sampling, we regularize the putative motion vectors by implicitly estimating a dense motion field, rectify the errors caused by outliers in local areas with a CNN, and finally obtain correct motion vectors from the rectified motion field. Extensive experiments reveal that ConvMatch with a simple CNN backbone consistently outperforms state-of-the-art methods, including MLP-based ones, for relative pose estimation and homography estimation, and shows promising generalization to different datasets and descriptors. Our code is publicly available at https://github.com/SuhZhang/ConvMatch.
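The sparse↔dense conversion the abstract describes can be illustrated with a minimal NumPy/SciPy sketch: scattered motion vectors are interpolated onto a regular grid (the "dense motion field"), and the field is sampled back at the original keypoint locations. Function names, the grid size, and the interpolation scheme here are illustrative assumptions, not the paper's actual implementation (which estimates the field implicitly inside the network).

```python
import numpy as np
from scipy.interpolate import griddata

def sparse_to_dense(points, vectors, grid_size=32):
    """Interpolate N sparse motion vectors (points, vectors: (N, 2) arrays,
    coordinates normalized to [0, 1]^2) into a (grid_size, grid_size, 2)
    dense motion field. Illustrative stand-in for the implicit estimation
    in ConvMatch."""
    ys, xs = np.mgrid[0:1:grid_size * 1j, 0:1:grid_size * 1j]
    field = np.empty((grid_size, grid_size, 2))
    for c in range(2):  # interpolate each motion component separately
        lin = griddata(points, vectors[:, c], (xs, ys), method="linear")
        nn = griddata(points, vectors[:, c], (xs, ys), method="nearest")
        # linear interpolation is undefined outside the convex hull of the
        # keypoints; fall back to nearest-neighbour values there
        field[..., c] = np.where(np.isnan(lin), nn, lin)
    return field

def dense_to_sparse(field, points):
    """Sample the dense motion field back at the sparse point locations
    (nearest grid cell)."""
    g = field.shape[0]
    idx = np.clip((points * g).astype(int), 0, g - 1)
    return field[idx[:, 1], idx[:, 0]]
```

In this round-trip, a smooth underlying motion survives while isolated outlier vectors get averaged away by their neighbors, which is the intuition behind rectifying the field with local (convolutional) operations before sampling it back.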

Published

2023-06-26

How to Cite

Zhang, S., & Ma, J. (2023). ConvMatch: Rethinking Network Design for Two-View Correspondence Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 37(3), 3472-3479. https://doi.org/10.1609/aaai.v37i3.25456

Section

AAAI Technical Track on Computer Vision III