Efficient Lightweight Image Denoising with Triple Attention Transformer

Authors

  • Yubo Zhou, Xiamen University, Fujian, China
  • Jin Lin, Xiamen University, Fujian, China
  • Fangchen Ye, Xiamen University, Fujian, China
  • Yanyun Qu, Xiamen University, Fujian, China
  • Yuan Xie, East China Normal University, Shanghai, China

DOI:

https://doi.org/10.1609/aaai.v38i7.28604

Keywords:

CV: Low Level & Physics-based Vision, CV: Learning & Optimization for CV

Abstract

Transformers have shown outstanding performance on image denoising, but existing Transformer-based denoising methods have large model sizes and high computational complexity, which makes them unsuitable for resource-constrained devices. In this paper, we propose a Lightweight Image Denoising Transformer (LIDFormer) based on Triple Multi-Dconv Head Transposed Attention (TMDTA) to boost computational efficiency. LIDFormer first applies a Discrete Wavelet Transform (DWT), which maps the input image into a low-frequency space and greatly reduces the computational complexity of image denoising. However, the low-frequency image lacks fine-grained feature information, which degrades denoising performance. To handle this problem, we introduce a Complementary Periodic Feature Reusing (CPFR) scheme that aggregates shallow-layer and deep-layer features. Furthermore, TMDTA is proposed to integrate global context along three dimensions, thereby enhancing global feature representation. Note that our method can be applied as a pipeline for both convolutional neural networks and Transformers. Extensive experiments on several benchmarks demonstrate that LIDFormer achieves a better trade-off between high performance and low computational complexity on real-world image denoising tasks.
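To make the computational argument concrete, the following is a minimal PyTorch sketch of two generic building blocks the abstract refers to: the low-frequency (LL) band of a Haar DWT, which halves the spatial resolution and therefore quarters the number of positions any subsequent attention has to process, and a multi-Dconv head transposed attention in the style of Restormer's MDTA, on which TMDTA builds. The names and details here are illustrative assumptions, not the authors' implementation; the triple-dimension attention and the CPFR scheme are specific to the paper and are not reproduced.

```python
# Sketch of (1) a Haar DWT low-frequency band and (2) channel-wise ("transposed")
# attention with a depthwise-conv QKV projection. Hypothetical reconstruction for
# illustration only; not the LIDFormer code.
import torch
import torch.nn as nn
import torch.nn.functional as F


def haar_lowpass(x: torch.Tensor) -> torch.Tensor:
    """LL band of an orthonormal 2D Haar DWT.

    For each 2x2 block [a b; c d] the LL coefficient is (a+b+c+d)/2, i.e. twice
    the 2x2 average; H and W are halved, so attention cost drops by ~4x."""
    return 2.0 * F.avg_pool2d(x, kernel_size=2, stride=2)


class TransposedAttention(nn.Module):
    """Multi-Dconv head transposed attention (MDTA-style).

    Q/K/V come from a 1x1 conv followed by a 3x3 depthwise conv; attention is
    computed between channels, so each head's map is (C/heads) x (C/heads)
    instead of HW x HW. TMDTA extends this idea along three dimensions."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.num_heads = num_heads
        self.temperature = nn.Parameter(torch.ones(num_heads, 1, 1))
        self.qkv = nn.Conv2d(dim, dim * 3, kernel_size=1, bias=False)
        self.dwconv = nn.Conv2d(dim * 3, dim * 3, kernel_size=3, padding=1,
                                groups=dim * 3, bias=False)
        self.project_out = nn.Conv2d(dim, dim, kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q, k, v = self.dwconv(self.qkv(x)).chunk(3, dim=1)
        # reshape to (batch, heads, channels_per_head, h*w)
        q, k, v = (t.reshape(b, self.num_heads, c // self.num_heads, h * w)
                   for t in (q, k, v))
        q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.temperature   # channel x channel
        out = attn.softmax(dim=-1) @ v                         # back to tokens
        return self.project_out(out.reshape(b, c, h, w))


if __name__ == "__main__":
    x = torch.randn(1, 48, 128, 128)        # noisy feature map
    ll = haar_lowpass(x)                    # (1, 48, 64, 64): 4x fewer positions
    y = TransposedAttention(dim=48)(ll)     # global context at reduced cost
    print(ll.shape, y.shape)
```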

Published

2024-03-24

How to Cite

Zhou, Y., Lin, J., Ye, F., Qu, Y., & Xie, Y. (2024). Efficient Lightweight Image Denoising with Triple Attention Transformer. Proceedings of the AAAI Conference on Artificial Intelligence, 38(7), 7704-7712. https://doi.org/10.1609/aaai.v38i7.28604

Issue

Vol. 38 No. 7 (2024)
Section

AAAI Technical Track on Computer Vision VI