Zero-shot Depth Completion via Test-time Alignment with Affine-invariant Depth Prior

Authors

  • Lee Hyoseok, Grad. School of Artificial Intelligence, POSTECH
  • Kyeong Seon Kim, Dept. of Electrical Engineering, POSTECH
  • Kwon Byung-Ki, Grad. School of Artificial Intelligence, POSTECH
  • Tae-Hyun Oh, Grad. School of Artificial Intelligence, POSTECH; Dept. of Electrical Engineering, POSTECH; Institute for Convergence Research and Education in Advanced Technology, Yonsei University

DOI:

https://doi.org/10.1609/aaai.v39i4.32405

Abstract

Depth completion, predicting dense depth maps from sparse depth measurements, is an ill-posed problem requiring prior knowledge. Recent methods adopt learning-based approaches to implicitly capture priors, but the priors primarily fit in-domain data and do not generalize well to out-of-domain scenarios. To address this, we propose a zero-shot depth completion method composed of an affine-invariant depth diffusion model and test-time alignment. We use pre-trained depth diffusion models as depth prior knowledge, which implicitly understand how to fill in depth for scenes. Our approach aligns the affine-invariant depth prior with metric-scale sparse measurements, enforcing them as hard constraints via an optimization loop at test-time. Our zero-shot depth completion method demonstrates generalization across various domain datasets, achieving up to a 21% average performance improvement over the previous state-of-the-art methods while enhancing spatial understanding by sharpening scene details. We demonstrate that aligning a monocular affine-invariant depth prior with sparse metric measurements is a sufficient strategy to achieve domain-generalizable depth completion without relying on extensive training datasets.
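
To make the term "affine-invariant" concrete: such a depth map is only defined up to an unknown scale and shift, so it has to be aligned to the metric sparse measurements before it can be read as metric depth. The sketch below is a deliberately simplified illustration, not the method of the paper: it fits a single global scale and shift in closed form by least squares, whereas the abstract describes an optimization loop run at test time with a diffusion prior. The function names, array shapes, and synthetic data are assumptions made for this example only.

```python
import numpy as np


def fit_scale_shift(relative_depth, sparse_depth, valid_mask):
    """Least-squares fit of s, t so that s * relative + t matches the
    sparse metric depth at the pixels where valid_mask is True.

    Hypothetical helper for illustration; needs at least two valid
    pixels with distinct relative-depth values.
    """
    x = relative_depth[valid_mask].astype(np.float64)
    y = sparse_depth[valid_mask].astype(np.float64)
    # Design matrix [x, 1] so the solution vector is (scale, shift).
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s, t


def align_depth(relative_depth, sparse_depth, valid_mask):
    """Apply the fitted global scale and shift to the whole relative map."""
    s, t = fit_scale_shift(relative_depth, sparse_depth, valid_mask)
    return s * relative_depth + t


# Synthetic example (assumed data): the "true" metric depth is an exact
# affine transform of the relative depth, observed at about 2% of pixels.
rng = np.random.default_rng(0)
relative = rng.uniform(0.0, 1.0, size=(48, 64))
true_metric = 2.5 * relative + 0.7
mask = rng.uniform(size=relative.shape) < 0.02
sparse = np.where(mask, true_metric, 0.0)

dense = align_depth(relative, sparse, mask)
```

A global fit like this matches the sparse points only in the least-squares sense, so on real data the measurements are generally not reproduced exactly. The abstract, by contrast, states that the proposed method enforces the measurements as hard constraints.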

Published

2025-04-11

How to Cite

Hyoseok, L., Kim, K. S., Byung-Ki, K., & Oh, T.-H. (2025). Zero-shot Depth Completion via Test-time Alignment with Affine-invariant Depth Prior. Proceedings of the AAAI Conference on Artificial Intelligence, 39(4), 3877–3885. https://doi.org/10.1609/aaai.v39i4.32405

Issue

Section

AAAI Technical Track on Computer Vision III