E-MaT: Event-oriented Mamba for Egocentric Point Tracking

Authors

  • Han Han, University of Science and Technology of China
  • Wei Zhai, University of Science and Technology of China
  • Baocai Yin, iFLYTEK Research
  • Yang Cao, University of Science and Technology of China
  • Bin Li, University of Science and Technology of China
  • Zheng-jun Zha, University of Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v40i6.42454

Abstract

Egocentric point tracking aims to localize points on object surfaces from a first-person perspective and serves as a critical step toward embodied intelligence. Recent methods rely on video input, tracking query points through feature matching across consecutive frames. However, these methods struggle in highly dynamic settings, which are common in first-person footage: the head-mounted camera undergoes frequent and abrupt rotations, resulting in high angular velocities, motion blur, and large inter-frame displacements. In contrast, event cameras capture motion at microsecond temporal resolution, naturally avoiding blur and delivering low-latency, high-fidelity cues crucial for egocentric point tracking. Moreover, rapid egocentric motion disrupts local smoothness, breaking the assumption that spatially adjacent regions share similar motion. Event dynamics expose global motion trends, guiding coherent modeling and consistent feature flow. Therefore, this paper proposes a Mamba-based tracking framework that constructs feature modeling paths aligned with the dominant motion trend extracted from events, and modulates feature propagation along these paths based on local motion intensity, enhancing stability by suppressing unreliable signals and emphasizing consistent cues. Additionally, a motion-adaptive suppression module enhances temporal robustness by adaptively suppressing correlation features based on motion intensity variations, mitigating the effects of intensity fluctuations and partial observability. To facilitate research in this domain, a multimodal dataset named DVS-EgoPoints, containing both events and videos for egocentric point tracking, is collected. Experiments on the DVS-EgoPoints dataset and a simulation benchmark demonstrate superior performance over state-of-the-art methods, especially under challenging motion and occlusion conditions.

Published

2026-03-14

How to Cite

Han, H., Zhai, W., Yin, B., Cao, Y., Li, B., & Zha, Z.-J. (2026). E-MaT: Event-oriented Mamba for Egocentric Point Tracking. Proceedings of the AAAI Conference on Artificial Intelligence, 40(6), 4547–4555. https://doi.org/10.1609/aaai.v40i6.42454

Section

AAAI Technical Track on Computer Vision III