Griffin: Aerial-Ground Cooperative Detection and Tracking Dataset and Benchmark

Authors

  • Jiahao Wang, Tsinghua University
  • Xiangyu Cao, Tsinghua University
  • Jiaru Zhong, Hong Kong Polytechnic University
  • Yuner Zhang, University of Pennsylvania
  • Zeyu Han, Tsinghua University
  • Haibao Yu, The University of Hong Kong
  • Chuang Zhang, Tsinghua University
  • Lei He, Tsinghua University
  • Shaobing Xu, Tsinghua University
  • Jianqiang Wang, Tsinghua University

DOI:

https://doi.org/10.1609/aaai.v40i12.37951

Abstract

While cooperative perception can overcome the limitations of single-vehicle systems, the practical implementation of vehicle-to-vehicle and vehicle-to-infrastructure systems is often impeded by significant economic barriers. Aerial-ground cooperation (AGC), which pairs ground vehicles with drones, presents a more economically viable and rapidly deployable alternative. However, this emerging field has been held back by a critical lack of high-quality public datasets and benchmarks. To bridge this gap, we present Griffin, a comprehensive AGC 3D perception dataset featuring over 250 dynamic scenes (37k+ frames). It incorporates varied drone altitudes (20-60 m), diverse weather conditions, realistic drone dynamics via CARLA-AirSim co-simulation, and critical occlusion-aware 3D annotations. Accompanying the dataset is a unified benchmarking framework for cooperative detection and tracking, with protocols to evaluate communication efficiency, altitude adaptability, and robustness to communication latency, data loss, and localization noise. Through experiments across different cooperative paradigms, we demonstrate the effectiveness and limitations of current methods and provide crucial insights for future research.

Published

2026-03-14

How to Cite

Wang, J., Cao, X., Zhong, J., Zhang, Y., Han, Z., Yu, H., … Wang, J. (2026). Griffin: Aerial-Ground Cooperative Detection and Tracking Dataset and Benchmark. Proceedings of the AAAI Conference on Artificial Intelligence, 40(12), 9867–9875. https://doi.org/10.1609/aaai.v40i12.37951

Section

AAAI Technical Track on Computer Vision IX