Query Quantized Neural SLAM

Authors

  • Sijia Jiang, Wayne State University
  • Jing Hua, Wayne State University
  • Zhizhong Han, Wayne State University

DOI:

https://doi.org/10.1609/aaai.v39i4.32425

Abstract

Neural implicit representations have shown remarkable abilities in jointly modeling geometry, color, and camera poses in simultaneous localization and mapping (SLAM). Current methods use coordinates, positional encodings, or other geometry features as input to query neural implicit functions for signed distances and color, which yield rendering errors that drive the optimization toward overfitting the image observations. However, because SLAM systems must run efficiently, each frame can be optimized for only a few iterations, which is far from enough for neural networks to overfit these queries. The resulting underfitting usually causes severe drift in camera tracking and artifacts in reconstruction. To resolve this issue, we propose query quantized neural SLAM, which uses quantized queries to reduce the variation in the input so that a frame can be overfit much more easily and quickly. To this end, we quantize a query into a discrete representation with a set of codes, so that the neural networks observe only a finite number of input variations. This allows the networks to become increasingly familiar with these codes as they overfit more and more previous frames. Moreover, we introduce novel initialization, losses, and augmentation to stabilize the optimization under the significant uncertainty of the early optimization stage, constrain the optimization space, and estimate camera poses more accurately. We justify the effectiveness of each design and report visual and numerical comparisons on widely used benchmarks to show our superiority over the latest methods in both reconstruction and camera tracking.
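
The idea of restricting a network to a finite set of query inputs can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the codes are built or assigned, so this sketch assumes a generic nearest-neighbor codebook lookup, and the names `quantize_queries` and `codebook`, the codebook size, and the random data are all hypothetical.

```python
import numpy as np


def quantize_queries(queries, codebook):
    """Map each continuous query to its nearest code (hypothetical helper).

    queries:  (num_queries, dim) array of continuous query vectors
    codebook: (num_codes, dim) array of discrete codes
    Returns the code index per query and the quantized queries.
    """
    # Pairwise squared Euclidean distances, shape (num_queries, num_codes).
    diffs = queries[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    # Each query is replaced by the code closest to it.
    indices = np.argmin(dists, axis=1)
    return indices, codebook[indices]


# Assumed toy setup: 16 codes and 1000 random 3D queries.
rng = np.random.default_rng(0)
codebook = rng.uniform(-1.0, 1.0, size=(16, 3))
queries = rng.uniform(-1.0, 1.0, size=(1000, 3))

indices, quantized = quantize_queries(queries, codebook)

# However many queries arrive, a downstream network would see
# at most len(codebook) distinct inputs.
num_distinct_inputs = len(np.unique(indices))
```

In this sketch, 1000 distinct continuous queries collapse onto at most 16 distinct inputs, which is the kind of reduced input variation the abstract describes.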

Published

2025-04-11

How to Cite

Jiang, S., Hua, J., & Han, Z. (2025). Query Quantized Neural SLAM. Proceedings of the AAAI Conference on Artificial Intelligence, 39(4), 4057–4065. https://doi.org/10.1609/aaai.v39i4.32425

Section

AAAI Technical Track on Computer Vision III