ESEG: Event-Based Segmentation Boosted by Explicit Edge-Semantic Guidance

Authors

  • Yucheng Zhao, Beijing University of Technology
  • Gengyu Lyu, Beijing University of Technology
  • Ke Li, Beijing University of Technology
  • Zihao Wang, Beijing University of Technology
  • Hao Chen, Southeast University
  • Zhen Yang, Beijing University of Technology
  • Yongjian Deng, Beijing University of Technology

DOI:

https://doi.org/10.1609/aaai.v39i10.33141

Abstract

Event-based semantic segmentation (ESS) has recently attracted researchers' attention, as event cameras can handle conditions such as under-/over-exposure and motion blur that are difficult for RGB cameras. However, event data are noisy and sparse, making it difficult for a model to locate and extract reliable cues from their sparse representations, especially in pixel-level tasks. In this paper, we propose a novel framework, ESEG, to alleviate this dilemma. Since event signals relate closely to moving edges, instead of designing complex structures and expecting them to recognize reliable edge regions behind event signals on their own, we introduce explicit edge-semantic supervision as a reference that lets the ESS model globally optimize semantics, exploiting the high confidence of event data in edge regions. In addition, we propose a fusion module named Density-Aware Dynamic-Window Cross Attention Fusion (D²CAF), in which density perception, cross-attention, and dynamic window masking mechanisms are jointly imposed to optimize edge-dense feature fusion, leveraging the characteristics of event cameras. Experimental results on the DSEC and DDD17 datasets demonstrate the efficacy of the ESEG framework and its core designs.
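To illustrate the general idea behind a density-aware, window-masked cross-attention fusion (the paper gives no implementation details here, so this is only a minimal toy sketch: the token layout, window size, density threshold `tau`, and the identity fallback for low-density tokens are all assumptions, not the authors' D²CAF design):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def density_window_cross_attention(event_feat, image_feat, density,
                                   window=3, tau=0.5):
    """Toy sketch of density-gated, locally-windowed cross-attention.

    event_feat, image_feat: (N, d) token features from the two branches.
    density: (N,) per-token event density in [0, 1].
    Each event token attends only to image tokens within `window`
    positions; tokens whose density falls below `tau` keep their own
    feature instead of the fused one (assumed gating rule).
    """
    N, d = event_feat.shape
    scores = event_feat @ image_feat.T / np.sqrt(d)          # (N, N)
    # local window mask: token i may attend to tokens j with |i-j| <= window
    idx = np.arange(N)
    mask = np.abs(idx[:, None] - idx[None, :]) <= window
    scores = np.where(mask, scores, -1e9)                    # mask out far tokens
    attn = softmax(scores, axis=-1)
    fused = attn @ image_feat                                # (N, d)
    # density gating: sparse (low-density) regions fall back to event features
    gate = (density >= tau)[:, None]
    return np.where(gate, fused, event_feat)
```

The sketch only conveys how the three ingredients named in the abstract (density perception, cross-attention, window masking) could interact; the actual module presumably operates on 2D feature maps with learned projections.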

Published

2025-04-11

How to Cite

Zhao, Y., Lyu, G., Li, K., Wang, Z., Chen, H., Yang, Z., & Deng, Y. (2025). ESEG: Event-Based Segmentation Boosted by Explicit Edge-Semantic Guidance. Proceedings of the AAAI Conference on Artificial Intelligence, 39(10), 10510-10518. https://doi.org/10.1609/aaai.v39i10.33141

Section

AAAI Technical Track on Computer Vision IX