Motion-adaptive Transformer for Event-based Image Deblurring
DOI:
https://doi.org/10.1609/aaai.v39i9.32967
Abstract
Event cameras, which capture pixel-level brightness changes asynchronously, provide rich motion information that is often missed during traditional frame-based camera exposures, thereby offering fresh perspectives for motion deblurring. Although current approaches incorporate event intensity, they neglect essential spatial motion information. Unlike CNN architectures, Transformers excel at modeling long-range dependencies, but they struggle to establish relevant non-local connections in sparse events and fail to highlight significant interactions in dense images. To address these limitations, we introduce a Motion-Adaptive Transformer network (MAT) that utilizes spatial motion information to forge robust global connections. The core design is an Adaptive Motion Mask Predictor (AMMP) that identifies key motion regions, guiding the Motion-Sparse Attention (MSA) to eliminate irrelevant event tokens and enabling the Motion-Aware Attention (MAA) to focus on relevant ones, thereby enhancing long-range dependency modeling. Additionally, we design a Cross-Modal Intensity Gating mechanism that efficiently merges intensity data across modalities while minimizing parameter use. The learnable Expansion-Controlled Spatial Gating further optimizes the transmission of event features. Comprehensive testing confirms that our approach sets a new benchmark in image deblurring, surpassing previous methods by up to 0.60 dB on the GoPro dataset and 1.04 dB on the HS-ERGB dataset, and achieving an average improvement of 0.52 dB across two real-world datasets.
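As a rough illustration of mask-guided sparse attention in general, the sketch below keeps only tokens whose motion score exceeds a threshold and runs standard scaled dot-product attention over the survivors. It is not the authors' AMMP or MSA: the abstract does not specify how the mask is predicted or applied, so the fixed threshold, the function names (`threshold_mask`, `masked_attention`), and the toy inputs are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def threshold_mask(motion_scores, threshold=0.5):
    """Hypothetical stand-in for a learned mask predictor: keep high-motion tokens."""
    return [score > threshold for score in motion_scores]


def masked_attention(queries, keys, values, keep_mask):
    """Scaled dot-product attention restricted to tokens where keep_mask is True."""
    kept = [i for i, keep in enumerate(keep_mask) if keep]
    if not kept:
        raise ValueError("mask must keep at least one token")
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Scores are computed only against kept tokens; pruned tokens never contribute.
        scores = [scale * sum(a * b for a, b in zip(q, keys[i])) for i in kept]
        weights = softmax(scores)
        outputs.append(
            [sum(w * values[i][d] for w, i in zip(weights, kept)) for d in range(dim)]
        )
    return outputs


# Toy example: three 2-D tokens with made-up motion scores.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
motion = [0.9, 0.1, 0.8]
mask = threshold_mask(motion)
attended = masked_attention(tokens, tokens, tokens, mask)
```

In this toy setup the second token falls below the threshold, so changing its value leaves every output unchanged. A learned predictor would replace the fixed threshold.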
Published
2025-04-11
How to Cite
Xu, S., Sun, Z., Zhong, M., Cao, C., Liu, Y., Fu, X., & Chen, Y. (2025). Motion-adaptive Transformer for Event-based Image Deblurring. Proceedings of the AAAI Conference on Artificial Intelligence, 39(9), 8942-8950. https://doi.org/10.1609/aaai.v39i9.32967
Issue
Section
AAAI Technical Track on Computer Vision VIII