Walking Further: Semantic-Aware Multimodal Gait Recognition Under Long-Range Conditions

Authors

  • Zhiyang Lu, Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University
  • Wen Jiang, Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University
  • Tianren Wu, Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University
  • Zhichao Wang, Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University
  • Changwang Zhang, OPPO Research Institute
  • Siqi Shen, Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University
  • Ming Cheng, Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University

DOI:

https://doi.org/10.1609/aaai.v40i9.37703

Abstract

Gait recognition is an emerging biometric technology that enables non-intrusive and hard-to-spoof human identification. However, most existing methods are confined to short-range, unimodal settings and fail to generalize to long-range and cross-distance scenarios under real-world conditions. To address this gap, we present LRGait, the first LiDAR-Camera multimodal benchmark designed for robust long-range gait recognition across diverse outdoor distances and environments. We further propose EMGaitNet, an end-to-end framework tailored for long-range multimodal gait recognition. To bridge the modality gap between RGB images and point clouds, we introduce a semantic-guided fusion pipeline. A CLIP-based Semantic Mining (SeMi) module first extracts human body-part-aware semantic cues, which are then employed to align 2D and 3D features via a Semantic-Guided Alignment (SGA) module within a unified embedding space. A Symmetric Cross-Attention Fusion (SCAF) module hierarchically integrates visual contours and 3D geometric features, and a Spatio-Temporal (ST) module captures global gait dynamics. Extensive experiments on various gait datasets validate the effectiveness of our method.
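To illustrate the kind of fusion the abstract describes, the sketch below shows a minimal single-head symmetric cross-attention step between two token sets, one per modality. All shapes, function names, the residual connections, and the mean-pool-and-concatenate readout are illustrative assumptions for exposition, not the paper's SCAF implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    # queries come from one modality; keys/values from the other
    d = queries.shape[-1]
    attn = softmax(queries @ keys_values.T / np.sqrt(d))
    return attn @ keys_values

def symmetric_cross_attention_fusion(rgb_tokens, lidar_tokens):
    # each modality attends to the other (symmetric), with a
    # residual connection; hypothetical readout: mean-pool each
    # enhanced stream, then concatenate into one embedding
    rgb_enh = rgb_tokens + cross_attention(rgb_tokens, lidar_tokens)
    lidar_enh = lidar_tokens + cross_attention(lidar_tokens, rgb_tokens)
    return np.concatenate([rgb_enh.mean(axis=0), lidar_enh.mean(axis=0)])

rng = np.random.default_rng(0)
rgb = rng.standard_normal((16, 64))    # 16 visual-contour tokens, dim 64
pts = rng.standard_normal((32, 64))    # 32 geometric point-cloud tokens, dim 64
fused = symmetric_cross_attention_fusion(rgb, pts)
print(fused.shape)  # (128,)
```

A real implementation would use learned query/key/value projections and stack such blocks hierarchically; this sketch only shows the symmetric attention pattern itself.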

Published

2026-03-14

How to Cite

Lu, Z., Jiang, W., Wu, T., Wang, Z., Zhang, C., Shen, S., & Cheng, M. (2026). Walking Further: Semantic-Aware Multimodal Gait Recognition Under Long-Range Conditions. Proceedings of the AAAI Conference on Artificial Intelligence, 40(9), 7618–7626. https://doi.org/10.1609/aaai.v40i9.37703

Section

AAAI Technical Track on Computer Vision VI