Prompt-SID: Learning Structural Representation Prompt via Latent Diffusion for Single Image Denoising
DOI:
https://doi.org/10.1609/aaai.v39i5.32500
Abstract
Many studies have concentrated on building supervised image denoising models from paired datasets, which are expensive and time-consuming to collect. Current self-supervised and unsupervised approaches typically rely on blind-spot networks or sub-image pair sampling, which lose pixel information and destroy detailed structural information, significantly limiting the efficacy of such methods. In this paper, we introduce Prompt-SID, a prompt-learning-based single image denoising framework that emphasizes the preservation of structural details. The approach is trained in a self-supervised manner using downsampled image pairs. It captures original-scale image information through structural encoding and integrates this prompt into the denoiser. To achieve this, we propose a structural representation generation model based on the latent diffusion process and design a structural attention module within the transformer-based denoiser architecture to decode the prompt. Additionally, we introduce a scale replay training mechanism, which effectively mitigates the scale gap between images of different resolutions. We conduct comprehensive experiments on synthetic, real-world, and fluorescence imaging datasets, demonstrating the effectiveness of Prompt-SID.
Published
2025-04-11
How to Cite
Li, H., Zhang, W., Hu, X., Jiang, T., Chen, Z., & Wang, H. (2025). Prompt-SID: Learning Structural Representation Prompt via Latent Diffusion for Single Image Denoising. Proceedings of the AAAI Conference on Artificial Intelligence, 39(5), 4734–4742. https://doi.org/10.1609/aaai.v39i5.32500
Issue
Section
AAAI Technical Track on Computer Vision IV