DiffRAW: Leveraging Diffusion Model to Generate DSLR-Comparable Perceptual Quality sRGB from Smartphone RAW Images

Authors

  • Mingxin Yi (Tsinghua Shenzhen International Graduate School, Tsinghua University, China)
  • Kai Zhang (Tsinghua Shenzhen International Graduate School, Tsinghua University, China; Research Institute of Tsinghua, Pearl River Delta)
  • Pei Liu (Media Technology Lab, Huawei, China)
  • Tanli Zuo (Media Technology Lab, Huawei, China)
  • Jingduo Tian (Media Technology Lab, Huawei, China)

DOI:

https://doi.org/10.1609/aaai.v38i7.28494

Keywords:

CV: Computational Photography, Image & Video Synthesis, CV: Low Level & Physics-based Vision, CV: Large Vision Models, CV: Applications

Abstract

Deriving DSLR-quality sRGB images from smartphone RAW images has become a compelling challenge due to discernible detail disparity, color mapping instability, and spatial misalignment in RAW-sRGB data pairs. We present DiffRAW, a novel method that, for the first time, incorporates a diffusion model in learning RAW-to-sRGB mappings. By leveraging the diffusion model, our approach effectively learns the high-quality detail distribution of DSLR images, thereby enhancing the details of output images. Simultaneously, we use the RAW image as a diffusion condition to preserve image structure information such as contours and textures. To mitigate the interference caused by color and spatial misalignment in training data pairs, we embed a color-position preserving condition within DiffRAW, ensuring that the output images exhibit neither color biases nor pixel-shift issues. To accelerate the inference process of DiffRAW, we design the Domain Transform Diffusion Method, an efficient diffusion process with a corresponding reverse process. The Domain Transform Diffusion Method reduces the inference steps required by diffusion-based image restoration/enhancement algorithms while improving the quality of the generated images. Through evaluations on the ZRR dataset, DiffRAW consistently demonstrates state-of-the-art performance across all perceptual quality metrics (e.g., LPIPS, FID, MUSIQ), while achieving comparable results in PSNR and SSIM.
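
The abstract only sketches the architecture, so the following minimal PyTorch example is an illustrative sketch (not the authors' implementation) of the conditioning idea it describes: a denoising network that receives the noisy sRGB estimate together with the RAW image and a color-position preserving condition as channel-wise inputs. All module names, tensor shapes, and layer choices here are assumptions made for illustration.

    # Illustrative sketch (not the authors' code): a conditional denoiser that
    # concatenates the noisy sRGB estimate with two conditioning inputs -- the
    # packed smartphone RAW image and a hypothetical "color-position preserving"
    # condition -- before predicting the noise, as the abstract describes at a
    # high level. Shapes and layers are assumptions.
    import torch
    import torch.nn as nn


    class ConditionalDenoiser(nn.Module):
        def __init__(self, base_channels: int = 64):
            super().__init__()
            # 3 (noisy sRGB) + 4 (RGGB-packed RAW) + 3 (color-position condition) input channels.
            self.net = nn.Sequential(
                nn.Conv2d(3 + 4 + 3, base_channels, 3, padding=1),
                nn.SiLU(),
                nn.Conv2d(base_channels, base_channels, 3, padding=1),
                nn.SiLU(),
                nn.Conv2d(base_channels, 3, 3, padding=1),  # noise prediction on the sRGB branch
            )
            # Simple learned timestep embedding, standing in for the sinusoidal
            # embeddings typically used in diffusion models.
            self.t_embed = nn.Embedding(1000, base_channels)

        def forward(self, noisy_srgb, raw_cond, color_pos_cond, t):
            x = torch.cat([noisy_srgb, raw_cond, color_pos_cond], dim=1)
            h = self.net[0](x)
            h = h + self.t_embed(t)[:, :, None, None]  # inject the diffusion step
            for layer in self.net[1:]:
                h = layer(h)
            return h  # epsilon prediction consumed by the reverse (sampling) process


    # Toy usage with random tensors standing in for real data.
    model = ConditionalDenoiser()
    noisy = torch.randn(1, 3, 64, 64)       # noisy sRGB at step t
    raw = torch.randn(1, 4, 64, 64)         # RGGB-packed smartphone RAW
    color_pos = torch.randn(1, 3, 64, 64)   # color-position preserving condition
    t = torch.randint(0, 1000, (1,))
    eps = model(noisy, raw, color_pos, t)
    print(eps.shape)  # torch.Size([1, 3, 64, 64])

Channel-wise concatenation is only one common way to condition a diffusion model; the paper's Domain Transform Diffusion Method, which shortens the reverse process, is not reflected in this sketch.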

Published

2024-03-24

How to Cite

Yi, M., Zhang, K., Liu, P., Zuo, T., & Tian, J. (2024). DiffRAW: Leveraging Diffusion Model to Generate DSLR-Comparable Perceptual Quality sRGB from Smartphone RAW Images. Proceedings of the AAAI Conference on Artificial Intelligence, 38(7), 6711-6719. https://doi.org/10.1609/aaai.v38i7.28494

Issue

Vol. 38 No. 7 (2024)

Section

AAAI Technical Track on Computer Vision VI