GUIDE: Gaussian Unified Instance Detection for Enhanced Obstacle Perception in Autonomous Driving

Authors

  • Chunyong Hu Unmanned Vehicle Dept., CaiNiao Inc., Alibaba Group
  • Qi Luo Unmanned Vehicle Dept., CaiNiao Inc., Alibaba Group
  • Jianyun Xu Unmanned Vehicle Dept., CaiNiao Inc., Alibaba Group
  • Song Wang Unmanned Vehicle Dept., CaiNiao Inc., Alibaba Group; Zhejiang University
  • Qiang Li Unmanned Vehicle Dept., CaiNiao Inc., Alibaba Group
  • Sheng Yang Unmanned Vehicle Dept., CaiNiao Inc., Alibaba Group

DOI:

https://doi.org/10.1609/aaai.v40i6.42484

Abstract

In the realm of autonomous driving, accurately detecting surrounding obstacles is crucial for effective decision-making. Traditional methods primarily rely on 3D bounding boxes to represent these obstacles, which often fail to capture the complexity of irregularly shaped, real-world objects. To overcome these limitations, we present GUIDE, a novel framework that utilizes 3D Gaussians for instance detection and occupancy prediction. Unlike conventional occupancy prediction methods, GUIDE also offers robust tracking capabilities. Our framework employs a sparse representation strategy, using Gaussian-to-Voxel Splatting to provide fine-grained, instance-level occupancy data without the computational demands associated with dense voxel grids. Experimental validation on the nuScenes dataset demonstrates GUIDE's performance, with an instance occupancy mAP of 21.61, marking a 50% improvement over existing methods, alongside competitive tracking capabilities. GUIDE establishes a new benchmark in autonomous perception systems, effectively combining precision with computational efficiency to better address the complexities of real-world driving environments.
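To make the idea of splatting Gaussians into a sparse voxel set concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function name `splat_gaussians_to_voxels`, the parameters `voxel_size`, `threshold`, and `n_sigma`, and the "keep the strongest contribution per voxel" rule are all assumptions chosen for illustration. Each Gaussian carries a mean, a covariance, an opacity, and an instance id; only voxels whose occupancy exceeds the threshold are stored, so no dense grid is ever allocated.

```python
import numpy as np


def splat_gaussians_to_voxels(means, covs, opacities, instance_ids,
                              voxel_size=0.5, threshold=0.3, n_sigma=3.0):
    """Illustrative sketch: return {voxel_index: (instance_id, occupancy)}.

    Only voxels with occupancy >= threshold are stored (sparse output).
    All parameter names and defaults are hypothetical.
    """
    occupied = {}
    for mean, cov, alpha, inst in zip(means, covs, opacities, instance_ids):
        inv_cov = np.linalg.inv(cov)
        # Axis-aligned extent of the n-sigma ellipsoid around the mean.
        radius = n_sigma * np.sqrt(np.diag(cov))
        lo = np.floor((mean - radius) / voxel_size).astype(int)
        hi = np.floor((mean + radius) / voxel_size).astype(int)
        # Enumerate only the voxels inside this Gaussian's bounding box.
        axes = [np.arange(l, h + 1) for l, h in zip(lo, hi)]
        idx = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
        centers = (idx + 0.5) * voxel_size
        diff = centers - mean
        # Squared Mahalanobis distance of each voxel center to the Gaussian.
        maha = np.einsum("ni,ij,nj->n", diff, inv_cov, diff)
        occ = alpha * np.exp(-0.5 * maha)
        for key, value in zip(map(tuple, idx.tolist()), occ.tolist()):
            # Assumed rule: a voxel keeps the instance with the largest contribution.
            if value >= threshold and value > occupied.get(key, (None, 0.0))[1]:
                occupied[key] = (int(inst), value)
    return occupied
```

Calling the function with a single isotropic Gaussian centred on a voxel centre yields one occupied voxel labelled with that Gaussian's instance id. Memory scales with the number of occupied voxels rather than with the volume of the scene, which is the property the abstract attributes to the sparse representation.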

Published

2026-03-14

How to Cite

Hu, C., Luo, Q., Xu, J., Wang, S., Li, Q., & Yang, S. (2026). GUIDE: Gaussian Unified Instance Detection for Enhanced Obstacle Perception in Autonomous Driving. Proceedings of the AAAI Conference on Artificial Intelligence, 40(6), 4816–4824. https://doi.org/10.1609/aaai.v40i6.42484

Section

AAAI Technical Track on Computer Vision III