GLoMOT: Efficient Online GNN-based Low-Frame-Rate Multi-Object Tracker

Authors

  • Yaxuan Hu, Wuhan University
  • Jie Hua, Wuhan University
  • Gang Wu, Tarim University
  • Yuhong Yang, Wuhan University
  • Atsushi Suzuki, The University of Hong Kong
  • Zhongyuan Wang, Wuhan University

DOI:

https://doi.org/10.1609/aaai.v40i6.42500

Abstract

Low-frame-rate (LFR) Multi-Object Tracking (MOT) is crucial for efficient tracking on edge devices, as it significantly reduces computational and storage demands. However, existing trackers struggle in LFR settings due to large temporal gaps, extreme appearance changes, and motion non-linearity. While Graph Neural Network (GNN)-based trackers are effective at associating objects across these gaps, most operate offline, which prevents their use for online tracking. To address these limitations, we propose GLoMOT, a novel online GNN-based LFR multi-object tracker designed for robust performance in LFR videos. To bridge the large temporal gaps, we introduce a Dynamic Node Buffer Pool, which acts as a long-term memory that caches the states of absent objects to enable their robust re-association. To tackle extreme motion uncertainty, we propose an adaptive context-aware module that dynamically adjusts the weights of positional and appearance features, generating more robust features for predicting node connections. Furthermore, we propose a pseudo-depth feature calculation method that provides the GNN with critical geometric context, which helps resolve spatial ambiguity arising from occlusions. Extensive experiments on several public MOT benchmarks, including DanceTrack, MOT17, and VisDrone, demonstrate GLoMOT's effectiveness and superiority, particularly in challenging LFR conditions.
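
The abstract describes the Dynamic Node Buffer Pool only at a high level, as a long-term memory that caches the states of absent objects so they can be re-associated later. The following is a minimal sketch of that general idea, not the authors' implementation: the class and field names (`NodeBufferPool`, `BufferedNode`, `max_age`) and the age-based eviction rule are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class BufferedNode:
    """Cached state of a track that is currently undetected (hypothetical fields)."""
    track_id: int
    box: tuple        # (x1, y1, x2, y2) of the last seen bounding box
    last_seen: int    # frame index of the last successful match


class NodeBufferPool:
    """Toy long-term memory for absent tracks; an illustration, not GLoMOT's code."""

    def __init__(self, max_age: int = 30):
        self.max_age = max_age   # assumed eviction horizon, in frames
        self.nodes = {}          # track_id -> BufferedNode

    def cache(self, track_id, box, frame_idx):
        """Store or refresh the state of a track that went unmatched."""
        self.nodes[track_id] = BufferedNode(track_id, box, frame_idx)

    def evict_stale(self, frame_idx):
        """Drop nodes absent for more than max_age frames; return their ids."""
        stale = [tid for tid, node in self.nodes.items()
                 if frame_idx - node.last_seen > self.max_age]
        for tid in stale:
            del self.nodes[tid]
        return stale

    def candidates(self):
        """Buffered nodes that may still be linked to new detections."""
        return list(self.nodes.values())

    def reactivate(self, track_id):
        """Remove and return a node once it is re-associated with a detection."""
        return self.nodes.pop(track_id)


pool = NodeBufferPool(max_age=5)
pool.cache(track_id=7, box=(10, 20, 50, 120), frame_idx=3)
pool.cache(track_id=9, box=(200, 40, 260, 180), frame_idx=8)
evicted = pool.evict_stale(frame_idx=10)   # track 7 has been absent for 7 frames
remaining = [node.track_id for node in pool.candidates()]
```

In this sketch, track 7 is evicted because it has been absent longer than the assumed `max_age`, while track 9 stays available as a re-association candidate. In a GNN-based tracker the buffered nodes would presumably be offered to the graph alongside active tracks, but how GLoMOT does this is not stated in the abstract.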

Published

2026-03-14

How to Cite

Hu, Y., Hua, J., Wu, G., Yang, Y., Suzuki, A., & Wang, Z. (2026). GLoMOT: Efficient Online GNN-based Low-Frame-Rate Multi-Object Tracker. Proceedings of the AAAI Conference on Artificial Intelligence, 40(6), 4959–4967. https://doi.org/10.1609/aaai.v40i6.42500

Section

AAAI Technical Track on Computer Vision III