DiffRAW: Leveraging Diffusion Model to Generate DSLR-Comparable Perceptual Quality sRGB from Smartphone RAW Images

Mingxin Yi; Kai Zhang; Pei Liu; Tanli Zuo; Jingduo Tian

doi:10.1609/aaai.v38i7.28494

Authors

Mingxin Yi, Tsinghua Shenzhen International Graduate School, Tsinghua University, China
Kai Zhang, Tsinghua Shenzhen International Graduate School, Tsinghua University, China; Research Institute of Tsinghua, Pearl River Delta
Pei Liu, Media Technology Lab, Huawei, China
Tanli Zuo, Media Technology Lab, Huawei, China
Jingduo Tian, Media Technology Lab, Huawei, China

Keywords:

CV: Computational Photography, Image & Video Synthesis, CV: Low Level & Physics-based Vision, CV: Large Vision Models, CV: Applications

Abstract

Deriving DSLR-quality sRGB images from smartphone RAW images has become a compelling challenge due to discernible detail disparity, color mapping instability, and spatial misalignment in RAW-sRGB data pairs. We present DiffRAW, a novel method that, for the first time, incorporates a diffusion model into learning RAW-to-sRGB mappings. By leveraging the diffusion model, our approach effectively learns the high-quality detail distribution of DSLR images, thereby enhancing the details of output images. Simultaneously, we use the RAW image as a diffusion condition to maintain image structure information such as contours and textures. To mitigate the interference caused by color and spatial misalignment in training data pairs, we embed a color-position preserving condition within DiffRAW, ensuring that the output images do not exhibit color biases or pixel shifts. To accelerate the inference process of DiffRAW, we design the Domain Transform Diffusion Method, an efficient diffusion process with a corresponding reverse process. The Domain Transform Diffusion Method can reduce the number of inference steps required by diffusion model-based image restoration/enhancement algorithms while enhancing the quality of the generated images. In evaluations on the ZRR dataset, DiffRAW consistently demonstrates state-of-the-art performance across all perceptual quality metrics (e.g., LPIPS, FID, MUSIQ), while achieving comparable results in PSNR and SSIM.
