MUTrack: A Memory-Aware Unified Representation Framework for Visual Tracking

Authors

  • Weijing Wu, Key Laboratory of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin 541004, China; University Engineering Research Center of Educational Intelligent Technology, Guangxi Normal University, Guilin 541004, China
  • Qihua Liang, Key Laboratory of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin 541004, China; University Engineering Research Center of Educational Intelligent Technology, Guangxi Normal University, Guilin 541004, China
  • Bineng Zhong, Key Laboratory of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin 541004, China; University Engineering Research Center of Educational Intelligent Technology, Guangxi Normal University, Guilin 541004, China
  • Xiaohu Tang, Key Laboratory of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin 541004, China; University Engineering Research Center of Educational Intelligent Technology, Guangxi Normal University, Guilin 541004, China
  • Yufei Tan, Key Laboratory of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin 541004, China; University Engineering Research Center of Educational Intelligent Technology, Guangxi Normal University, Guilin 541004, China; Guangxi Key Laboratory of Brain-inspired Computing and Intelligent Chips, School of Electronic and Information Engineering, Guangxi Normal University, Guilin 541004, China
  • Ning Li, Key Laboratory of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin 541004, China; University Engineering Research Center of Educational Intelligent Technology, Guangxi Normal University, Guilin 541004, China
  • Yuanliang Xue, Xi’an Research Institute of High Technology, Xi’an 710025, China

DOI:

https://doi.org/10.1609/aaai.v40i13.38052

Abstract

Building a unified target representation that simultaneously achieves short-term adaptability and long-term stability is crucial for robust visual tracking. However, existing trackers typically face an inherent trade-off. Methods that rely primarily on short-term appearance and motion cues adapt rapidly, but they often struggle to maintain long-term identity consistency. Conversely, trackers that emphasize extensive temporal context provide strong robustness, yet this can compromise their short-term adaptability. To bridge this gap, we propose a novel tracker, MUTrack, which integrates both long-term and short-term memories into a unified target representation for more robust tracking. Specifically, we design a unified memory bank that stores and manages long-term memory for maintaining long-term identity consistency, and short-term memory for adapting to instantaneous appearance changes. To fully leverage the complementary nature of long-term and short-term temporal information, we introduce a perception interaction module that dynamically fuses these memory types through deep, bidirectional interactions, enabling mutual refinement in which each guides the other. This ultimately generates a highly adaptive target representation that balances adaptability to instantaneous changes with robustness against long-term identity drift. Extensive experiments on GOT10k, TrackingNet, LaSOT, LaSOT_ext, NfS, and OTB100 consistently demonstrate that MUTrack achieves state-of-the-art performance.
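
As a rough illustration of the idea described in the abstract, the following minimal sketch shows one way a memory bank could hold a short-term and a long-term store and fuse them in both directions. It is not the authors' implementation: the class name UnifiedMemoryBank, the helper attend, the parameters short_size, long_size and long_interval, the sparse-sampling update rule, and the dot-product attention used for fusion are all assumptions made for illustration, and features are plain Python lists rather than learned embeddings.

```python
from collections import deque
import math


def attend(query, keys):
    """Softmax-weighted average of `keys`, scored by dot product with `query`."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [
        sum(w * key[i] for w, key in zip(weights, keys)) / total
        for i in range(len(query))
    ]


class UnifiedMemoryBank:
    """Hypothetical two-store memory: recent frames plus sparsely sampled history."""

    def __init__(self, short_size=3, long_size=4, long_interval=10):
        self.short = deque(maxlen=short_size)  # most recent features (FIFO)
        self.long = deque(maxlen=long_size)    # one feature every `long_interval` frames
        self.long_interval = long_interval
        self.frame = 0

    def update(self, feature):
        """Store the feature of the current frame."""
        self.short.append(feature)
        if self.frame % self.long_interval == 0:  # assumed sparse sampling rule
            self.long.append(feature)
        self.frame += 1

    def fuse(self):
        """Refine each store with the other, then average into one vector."""
        if not self.short or not self.long:
            raise ValueError("memory bank is empty; call update() first")
        short, long_ = list(self.short), list(self.long)
        refined = []
        # Short-term entries attend to long-term memory, and vice versa.
        for own, other in ((short, long_), (long_, short)):
            for entry in own:
                context = attend(entry, other)
                refined.append([(e + c) / 2 for e, c in zip(entry, context)])
        dim = len(refined[0])
        return [sum(vec[i] for vec in refined) / len(refined) for i in range(dim)]


# Usage: feed 25 frames of toy 2-D features, then build the fused representation.
bank = UnifiedMemoryBank()
for t in range(25):
    bank.update([float(t), 1.0])
representation = bank.fuse()
```

In this sketch the short-term store always holds the latest frames, while the long-term store changes only every long_interval frames, which is one simple way to trade fast adaptation against stability. A real tracker would replace the averaging in fuse with learned attention layers.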

Published

2026-03-14

How to Cite

Wu, W., Liang, Q., Zhong, B., Tang, X., Tan, Y., Li, N., & Xue, Y. (2026). MUTrack: A Memory-Aware Unified Representation Framework for Visual Tracking. Proceedings of the AAAI Conference on Artificial Intelligence, 40(13), 10772–10780. https://doi.org/10.1609/aaai.v40i13.38052

Section

AAAI Technical Track on Computer Vision X