Dual-Window Multiscale Transformer for Hyperspectral Snapshot Compressive Imaging

Authors

  • Fulin Luo, College of Computer Science, Chongqing University
  • Xi Chen, College of Computer Science, Chongqing University
  • Xiuwen Gong, Faculty of Engineering, The University of Sydney
  • Weiwen Wu, Department of Biomedical Engineering, Sun Yat-sen University
  • Tan Guo, School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications

DOI:

https://doi.org/10.1609/aaai.v38i4.28190

Keywords:

CV: Computational Photography, Image & Video Synthesis

Abstract

Coded aperture snapshot spectral imaging (CASSI) is an effective approach to hyperspectral snapshot compressive imaging. The core problem in CASSI is solving the inverse problem of reconstructing the hyperspectral image (HSI). In recent years, Transformer-based methods have achieved promising performance in HSI reconstruction. However, capturing both long-range dependencies and local information at a reasonable computational cost remains challenging. In this paper, we propose a Transformer-based HSI reconstruction method called the dual-window multiscale Transformer (DWMT), a coarse-to-fine process that reconstructs the global properties of the HSI using long-range dependencies. We propose a novel U-Net architecture that uses a dual-branch encoder to refine pixel information and full-scale skip connections to fuse different features, enhancing the extraction of fine-grained features. We also design a novel self-attention mechanism called dual-window multiscale multi-head self-attention (DWM-MSA), which computes self-attention within two windows of different sizes and thereby captures long-range dependencies in a local region at different scales, improving reconstruction performance. In addition, we propose a novel position embedding method for Transformers, named con-abs position embedding (CAPE), which effectively enhances the positional information of HSIs. Extensive experiments on both simulated and real data demonstrate the superior performance, stability, and generalization ability of DWMT. The code for this project is available at https://github.com/chenx2000/DWMT.
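
The idea of computing self-attention inside two windows of different sizes can be illustrated with a minimal NumPy sketch. This is not the DWM-MSA implementation from the paper or its repository: the function names, the single-head attention with identity projections, the window sizes, and the choice to split the channels between the two window sizes are all assumptions made for illustration only.

```python
import numpy as np

def window_partition(x, w):
    """Split an (H, W, C) feature map into non-overlapping w x w windows.
    Returns an array of shape (num_windows, w*w, C)."""
    H, W, C = x.shape
    assert H % w == 0 and W % w == 0, "H and W must be divisible by w"
    x = x.reshape(H // w, w, W // w, w, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, w * w, C)

def window_merge(windows, w, H, W):
    """Inverse of window_partition: rebuild the (H, W, C) feature map."""
    C = windows.shape[-1]
    x = windows.reshape(H // w, W // w, w, w, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(H, W, C)

def softmax(a, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_self_attention(x, w):
    """Single-head self-attention restricted to each w x w window.
    Assumption: Q = K = V = the tokens themselves (no learned projections)."""
    H, W, C = x.shape
    tokens = window_partition(x, w)                        # (n, w*w, C)
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(C)
    out = softmax(scores) @ tokens                         # (n, w*w, C)
    return window_merge(out, w, H, W)

def dual_window_attention(x, small=4, large=8):
    """Hypothetical dual-window scheme: half of the channels attend within
    small windows, the other half within large windows, then concatenate."""
    half = x.shape[-1] // 2
    fine = window_self_attention(x[..., :half], small)
    coarse = window_self_attention(x[..., half:], large)
    return np.concatenate([fine, coarse], axis=-1)
```

Calling `dual_window_attention` on a feature map of shape `(16, 16, 8)` returns an array of the same shape. Restricting attention to windows keeps the cost per window quadratic in the window area rather than in the full image area, which is the usual motivation for window-based attention.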

Published

2024-03-24

How to Cite

Luo, F., Chen, X., Gong, X., Wu, W., & Guo, T. (2024). Dual-Window Multiscale Transformer for Hyperspectral Snapshot Compressive Imaging. Proceedings of the AAAI Conference on Artificial Intelligence, 38(4), 3972-3980. https://doi.org/10.1609/aaai.v38i4.28190

Issue

Section

AAAI Technical Track on Computer Vision III