TGFormer: Transformer with Track Query Group for Multi-Object Tracking
DOI:
https://doi.org/10.1609/aaai.v39i9.33065
Abstract
Multi-object tracking faces a major challenge in handling the variations of tracked targets within complex scenes. In existing transformer-based tracking methods, each tracked target is typically associated with only one track query. However, trajectories in crowded scenes often experience varying levels of occlusion, which makes association brittle when a single track query must identify the tracked target. Therefore, we argue that relying on a single track query to track a target in complex scenes is inadequate. In this paper, we introduce TGFormer, with the core idea of designing a Track Query Group for each tracked target. Each group encompasses track queries that handle the same tracked target across different levels of occlusion. To achieve long-term robust association, we propose a novel updater that integrates temporal memories and occlusion-aware features to update the Track Query Group, ensuring the tracked target can be consistently captured in complex scenes. Additionally, we introduce a Position Predictor that allows TGFormer to forecast motion trends, helping the model accurately locate moving tracklets. Experimental results show that our method achieves competitive performance on the MOT Challenge and DanceTrack datasets.
Published
2025-04-11
How to Cite
Zeng, R., Huang, Y., & Pei, S. (2025). TGFormer: Transformer with Track Query Group for Multi-Object Tracking. Proceedings of the AAAI Conference on Artificial Intelligence, 39(9), 9824–9832. https://doi.org/10.1609/aaai.v39i9.33065
Issue
Section
AAAI Technical Track on Computer Vision VIII