Debiased Multiplex Tokenizer for Efficient Map-Free Visual Relocalization

Authors

  • Wenshuai Wang, Peking University; Pengcheng Laboratory
  • Hong Liu, Peking University
  • Shengquan Li, Pengcheng Laboratory
  • Peifeng Jiang, Peking University
  • Runwei Ding, Pengcheng Laboratory

DOI:

https://doi.org/10.1609/aaai.v40i12.37982

Abstract

Image-based feature representation plays a critical role in visual localization, enabling robots to estimate their position and orientation in GPS-denied environments. However, this task is often undermined by significant variations in camera viewpoints and scene appearances. Recently, map-free visual relocalization (MFVR) has emerged as a promising paradigm due to its compatibility with lightweight deployment and privacy isolation on mobile devices. In this paper, we propose the Debiased Multiplex Tokenizer (DeMT) as a novel method for versatile and efficient MFVR. Specifically, DeMT performs relative pose regression through an integrated framework built upon a pretrained vision Mamba encoder, comprising three key modules: first, Multiplex Interactive Tokenization yields robust image tokens with non-local affinities and cross-domain descriptions; second, Debiased Anchor Registration facilitates anchor token matching through proximity graph retrieval and causal pointer attribution; third, Geometry-Informed Pose Regression empowers multi-layer perceptrons with a gating mechanism and spectral normalization to support both pair-wise and multi-view modes. Extensive evaluations across nine public datasets demonstrate that DeMT substantially outperforms existing baselines and ablation variants in diverse indoor and outdoor environments.
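
As background for the term "relative pose regression", the quantity being regressed can be illustrated with a small NumPy sketch. This is not DeMT itself: the abstract does not state the pose parameterization or frame conventions, so the example assumes 4x4 camera-to-world homogeneous matrices, and every function name here (`rot_z`, `make_pose`, `relative_pose`, `rotation_angle_deg`) is hypothetical. It only shows how the relative pose between a reference view and a query view would be formed from two known absolute poses, which is the kind of target a regressor would be trained to predict from the image pair.

```python
import numpy as np


def rot_z(theta):
    """Rotation matrix for an angle theta (radians) about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def make_pose(R, t):
    """Assemble a 4x4 homogeneous pose (assumed camera-to-world)."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


def relative_pose(T_world_ref, T_world_query):
    """Pose of the query camera expressed in the reference camera frame."""
    return np.linalg.inv(T_world_ref) @ T_world_query


def rotation_angle_deg(R):
    """Angle of a rotation matrix in degrees, from its trace."""
    cos_angle = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos_angle))


# Illustrative poses only; these values are made up for the example.
T_ref = make_pose(rot_z(np.radians(10.0)), np.array([1.0, 0.0, 0.0]))
T_query = make_pose(rot_z(np.radians(40.0)), np.array([1.0, 2.0, 0.0]))

T_rel = relative_pose(T_ref, T_query)
rel_angle = rotation_angle_deg(T_rel[:3, :3])  # relative rotation, degrees
rel_dist = np.linalg.norm(T_rel[:3, 3])        # relative translation length
```

Composing the reference pose with the relative pose recovers the query pose (`T_ref @ T_rel`), which is why a map-free method needs only a reference image with a known pose plus a predicted relative pose to localize the query.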

Published

2026-03-14

How to Cite

Wang, W., Liu, H., Li, S., Jiang, P., & Ding, R. (2026). Debiased Multiplex Tokenizer for Efficient Map-Free Visual Relocalization. Proceedings of the AAAI Conference on Artificial Intelligence, 40(12), 10145–10153. https://doi.org/10.1609/aaai.v40i12.37982

Section

AAAI Technical Track on Computer Vision IX