When Trackers Date Fish: A Benchmark and Framework for Underwater Multiple Fish Tracking

Authors

  • Weiran Li, China Agricultural University; National University of Singapore
  • Yeqiang Liu, China Agricultural University
  • Qiannan Guo, China Agricultural University
  • Yijie Wei, China Agricultural University
  • Hwa Liang Leo, National University of Singapore
  • Zhenbo Li, China Agricultural University

DOI:

https://doi.org/10.1609/aaai.v40i8.37574

Abstract

Multiple object tracking (MOT) has made significant progress in terrestrial applications, but underwater tracking remains underexplored despite its importance to marine ecology and aquaculture. In this paper, we present the Multiple Fish Tracking Dataset 2025 (MFT25), a dataset designed specifically for underwater multiple fish tracking, comprising 15 diverse video sequences with 408,578 meticulously annotated bounding boxes across 48,066 frames. The dataset covers a variety of underwater environments and fish species, as well as challenging conditions including occlusions, similar appearances, and erratic motion patterns. We also introduce the Scale-aware and Unscented Tracker (SU-T), a specialized tracking framework that combines an Unscented Kalman Filter (UKF) optimized for the non-linear swimming patterns of fish with a novel Fish-Intersection-over-Union (FishIoU) matching strategy that accounts for the morphological characteristics of aquatic species. Extensive experiments show that the SU-T baseline achieves state-of-the-art performance on MFT25, with 34.1 HOTA and 44.6 IDF1, and reveal fundamental differences between fish tracking and terrestrial object tracking.
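
To illustrate why an Unscented Kalman Filter suits non-linear motion, the following is a minimal sketch of the generic unscented transform that underlies a UKF prediction step. It is not the SU-T implementation: the state layout (position, heading, speed), the constant-turn motion model `turn_step`, and all parameter values are assumptions made purely for illustration, and the abstract does not specify the state or motion model the authors use.

```python
import numpy as np

def sigma_points(mean, cov, alpha=1.0, beta=2.0, kappa=0.0):
    """Standard scaled sigma points and weights for an n-dimensional Gaussian."""
    n = mean.shape[0]
    lam = alpha**2 * (n + kappa) - n
    # Columns of the Cholesky factor of (n + lam) * cov are the spread directions.
    L = np.linalg.cholesky((n + lam) * cov)
    pts = np.vstack([mean, mean + L.T, mean - L.T])  # shape (2n + 1, n)
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))  # weights for the mean
    wc = wm.copy()                                    # weights for the covariance
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1.0 - alpha**2 + beta)
    return pts, wm, wc

def unscented_transform(mean, cov, f):
    """Propagate a Gaussian (mean, cov) through a possibly non-linear function f."""
    pts, wm, wc = sigma_points(mean, cov)
    ys = np.array([f(p) for p in pts])   # push every sigma point through f
    y_mean = wm @ ys                     # weighted mean of transformed points
    d = ys - y_mean
    y_cov = (wc[:, None] * d).T @ d      # weighted covariance
    return y_mean, y_cov

def turn_step(state, dt=1.0, omega=0.3):
    """Hypothetical constant-turn motion model; state = [x, y, heading, speed]."""
    x, y, heading, speed = state
    return np.array([
        x + speed * np.cos(heading) * dt,
        y + speed * np.sin(heading) * dt,
        heading + omega * dt,
        speed,
    ])

# Illustrative values only (not taken from MFT25).
mean0 = np.array([0.0, 0.0, 0.0, 1.0])
cov0 = np.diag([0.1, 0.1, 0.05, 0.02])
pred_mean, pred_cov = unscented_transform(mean0, cov0, turn_step)
print(pred_mean)
print(pred_cov)
```

Unlike an extended Kalman filter, this approach needs no Jacobian of the motion model: the non-linearity is captured by pushing a small, deterministic set of sample points through the function. In a full UKF, process noise would be added to `pred_cov`, and the same transform would be applied to the measurement function in the update step.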

Published

2026-03-14

How to Cite

Li, W., Liu, Y., Guo, Q., Wei, Y., Leo, H. L., & Li, Z. (2026). When Trackers Date Fish: A Benchmark and Framework for Underwater Multiple Fish Tracking. Proceedings of the AAAI Conference on Artificial Intelligence, 40(8), 6459–6467. https://doi.org/10.1609/aaai.v40i8.37574

Issue

Vol. 40 No. 8 (2026)

Section

AAAI Technical Track on Computer Vision V