Geometry-Aware Stereo Matching via Monocular Disparity Distribution Prior and Gradient Enhancement

Authors

  • Junze Zhang, Academy of Military Sciences Intelligent Game and Decision Lab
  • Luoxi Jing, Peking University
  • Yuanyuan Wang, Academy of Military Sciences
  • Xueqi Li, Academy of Military Sciences
  • Guoli Yang, Advanced Institute of Big Data
  • Songchang Jin, Intelligent Game and Decision Lab
  • Chunping Qiu, Intelligent Game and Decision Lab

DOI:

https://doi.org/10.1609/aaai.v40i15.38253

Abstract

Stereo matching recovers 3D scene information from the correlation between corresponding pixels in a stereo image pair. Despite impressive progress, existing methods lack sufficient correlation priors in ill-posed regions such as occlusions, finely detailed regions, and reflective regions. To address this issue, we propose the Geometry Aware Stereo Matching Network (GEAStereo), which enhances geometric structure perception. We adaptively incorporate a Monocular Disparity Distribution Prior into the stereo cost volume to build a Mono-Stereo Fusion Volume (MSFV), which captures global geometric structures and rectifies the correlation information in ill-posed regions. Furthermore, we introduce rich detail information from gradient features and construct a Detail-Aware Volume (DAV) by aggregating the group-wise cost volume under the guidance of gradient spatial attention, thereby enhancing correlation modeling in detailed structures. Together, MSFV and DAV provide rich correlation priors for iterative disparity optimization. Experimental results show that our method achieves competitive results on the ETH3D and KITTI2015 benchmarks and demonstrates stronger zero-shot generalization than state-of-the-art methods.
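
For readers unfamiliar with the term, the group-wise cost volume mentioned in the abstract can be illustrated with a minimal NumPy sketch. It follows the commonly used group-wise correlation formulation (feature channels split into groups, with a mean dot product per group at each candidate disparity). It is not the authors' implementation: the abstract does not specify how GEAStereo builds its volumes, and the function name, argument names, and array layout below are assumptions made for illustration only. The monocular prior fusion and the gradient spatial attention are not shown.

```python
import numpy as np


def groupwise_correlation_volume(left, right, max_disp, num_groups):
    """Illustrative group-wise correlation cost volume (hypothetical helper).

    left, right: feature maps of shape (C, H, W) from the left and right views.
    Returns an array of shape (num_groups, max_disp, H, W).
    """
    c, h, w = left.shape
    if c % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    if max_disp > w:
        raise ValueError("max_disp must not exceed the feature width")
    ch_per_group = c // num_groups
    volume = np.zeros((num_groups, max_disp, h, w), dtype=left.dtype)
    for d in range(max_disp):
        # Left pixel x is compared with right pixel x - d.
        # Columns x < d have no counterpart and keep a cost of zero.
        prod = left[:, :, d:] * right[:, :, : w - d]
        # Average the channel-wise products inside each group of channels.
        grouped = prod.reshape(num_groups, ch_per_group, h, w - d)
        volume[:, d, :, d:] = grouped.mean(axis=1)
    return volume
```

Under these assumptions, each slice volume[:, d] holds one correlation score per channel group for disparity d, so a later aggregation stage can weight the groups differently instead of collapsing all channels into a single score.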

Published

2026-03-14

How to Cite

Zhang, J., Jing, L., Wang, Y., Li, X., Yang, G., Jin, S., & Qiu, C. (2026). Geometry-Aware Stereo Matching via Monocular Disparity Distribution Prior and Gradient Enhancement. Proceedings of the AAAI Conference on Artificial Intelligence, 40(15), 12582–12590. https://doi.org/10.1609/aaai.v40i15.38253

Section

AAAI Technical Track on Computer Vision XII