Motion Deblurring via Spatial-Temporal Collaboration of Frames and Events
DOI: https://doi.org/10.1609/aaai.v38i7.28474
Keywords: CV: Computational Photography, Image & Video Synthesis; CV: Low Level & Physics-based Vision; CV: Multi-modal Vision
Abstract
Motion deblurring can be advanced by exploiting informative features from supplementary sensors such as event cameras, which capture rich motion information asynchronously with high temporal resolution. Existing event-based motion deblurring methods consider neither the modality redundancy in spatial fusion nor the temporal cooperation between events and frames. To tackle these limitations, we propose a novel spatial-temporal collaboration network (STCNet) for event-based motion deblurring. First, we propose a differential-modality-based cross-modal calibration strategy to suppress redundancy and enhance complementarity, and then achieve bimodal spatial fusion with an elaborate cross-modal co-attention mechanism that weights the contributions of the two modalities for importance balance. In addition, we present a frame-event mutual spatio-temporal attention scheme to alleviate the errors of relying only on frames to compute cross-temporal similarities when motion blur is significant, and then aggregate the spatio-temporal features from both frames and events with a custom cross-temporal coordinate attention. Extensive experiments on both synthetic and real-world datasets demonstrate that our method achieves state-of-the-art performance.
Project website: https://github.com/wyang-vis/STCNet
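The abstract's cross-modal co-attention idea, where frame and event features attend to each other and their contributions are then balanced, can be sketched roughly as follows. This is a minimal illustrative NumPy sketch, not the authors' implementation; the function `co_attention_fuse`, the gating scheme, and the token shapes are all assumptions made for illustration.

```python
# Hypothetical sketch of bimodal co-attention fusion between frame
# features F and event features E, both flattened to (N, C) token
# matrices of equal size. Not the STCNet code.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def co_attention_fuse(F, E):
    """Cross-attend frames <-> events, then balance the two branches
    with an element-wise softmax gate (illustrative simplification)."""
    scale = np.sqrt(F.shape[-1])
    f2e = softmax(F @ E.T / scale) @ E   # frame tokens attend to event tokens
    e2f = softmax(E @ F.T / scale) @ F   # event tokens attend to frame tokens
    gate = softmax(np.stack([f2e, e2f]), axis=0)  # per-element importance weights
    return gate[0] * f2e + gate[1] * e2f          # weighted bimodal fusion

rng = np.random.default_rng(0)
F = rng.standard_normal((64, 32))
E = rng.standard_normal((64, 32))
fused = co_attention_fuse(F, E)
print(fused.shape)  # (64, 32)
```

The gate here is a stand-in for the paper's co-attention importance balancing; the actual mechanism and calibration strategy are described in the full paper.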
Published
2024-03-24
How to Cite
Yang, W., Wu, J., Ma, J., Li, L., & Shi, G. (2024). Motion Deblurring via Spatial-Temporal Collaboration of Frames and Events. Proceedings of the AAAI Conference on Artificial Intelligence, 38(7), 6531-6539. https://doi.org/10.1609/aaai.v38i7.28474
Section: AAAI Technical Track on Computer Vision VI