Motion-adaptive Transformer for Event-based Image Deblurring

Authors

  • Senyan Xu University of Science and Technology of China
  • Zhijing Sun University of Science and Technology of China
  • Mingchen Zhong University of Science and Technology of China
  • Chengzhi Cao University of Science and Technology of China
  • Yidi Liu University of Science and Technology of China
  • Xueyang Fu University of Science and Technology of China
  • Yan Chen University of Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v39i9.32967

Abstract

Event cameras, which capture pixel-level brightness changes asynchronously, provide rich motion information that is often missed during traditional frame-based camera exposures, thereby offering fresh perspectives for motion deblurring. Although current approaches incorporate event intensity, they neglect essential spatial motion information. Unlike CNN architectures, Transformers excel at modeling long-range dependencies, yet they struggle to establish relevant non-local connections in sparse events and fail to highlight significant interactions in dense images. To address these limitations, we introduce a Motion-Adaptive Transformer network (MAT) that utilizes spatial motion information to forge robust global connections. The core design is an Adaptive Motion Mask Predictor (AMMP) that identifies key motion regions, guiding the Motion-Sparse Attention (MSA) to eliminate irrelevant event tokens and enabling the Motion-Aware Attention (MAA) to focus on relevant ones, thereby enhancing long-range dependency modeling. Additionally, we design a Cross-Modal Intensity Gating mechanism that efficiently merges intensity data across modalities while minimizing parameter use. A learnable Expansion-Controlled Spatial Gating further optimizes the transmission of event features. Comprehensive testing confirms that our approach sets a new benchmark in image deblurring, surpassing previous methods by up to 0.60 dB on the GoPro dataset and 1.04 dB on the HS-ERGB dataset, and achieving an average improvement of 0.52 dB across two real-world datasets.
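The abstract gives no equations, but the core idea behind Motion-Sparse Attention, using a predicted motion mask to prune irrelevant event tokens before computing attention, can be illustrated with a minimal sketch. This is not the paper's implementation: the function and parameter names (`motion_sparse_attention`, `motion_mask`) are hypothetical, and the binary mask here simply stands in for the output of the Adaptive Motion Mask Predictor.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def motion_sparse_attention(q, k, v, motion_mask):
    """Sketch of mask-guided sparse attention (assumed formulation).

    q, k, v: (N, d) token matrices; motion_mask: (N,) boolean array,
    True where a hypothetical mask predictor flags a motion region.
    Only the flagged tokens are kept as keys/values, so queries cannot
    attend to event tokens outside the predicted motion regions.
    """
    idx = np.flatnonzero(motion_mask)          # indices of motion-relevant tokens
    k_s, v_s = k[idx], v[idx]                  # pruned key/value sets
    scores = q @ k_s.T / np.sqrt(q.shape[-1])  # scaled dot-product over survivors
    return softmax(scores, axis=-1) @ v_s      # (N, d) aggregated output

# Toy usage: 6 event tokens of dimension 4, half flagged as motion.
rng = np.random.default_rng(0)
q = rng.standard_normal((6, 4))
k = rng.standard_normal((6, 4))
v = rng.standard_normal((6, 4))
mask = np.array([True, False, True, False, True, False])
out = motion_sparse_attention(q, k, v, mask)   # shape (6, 4)
```

Dropping masked-out keys before the softmax is what distinguishes this from ordinary masked attention: the score matrix shrinks with the number of pruned tokens, which is where the efficiency on sparse event data would come from.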

Published

2025-04-11

How to Cite

Xu, S., Sun, Z., Zhong, M., Cao, C., Liu, Y., Fu, X., & Chen, Y. (2025). Motion-adaptive Transformer for Event-based Image Deblurring. Proceedings of the AAAI Conference on Artificial Intelligence, 39(9), 8942-8950. https://doi.org/10.1609/aaai.v39i9.32967

Section

AAAI Technical Track on Computer Vision VIII