SSCL: Adversarially Guided Image Compression via Semantic and Spectral Consistency Learning

Wei Jiang; Yongqi Zhai; Jiayu Yang; Bohao Feng; Wenqiang Wang; Bo Huang; Lin Ding; Ronggang Wang

doi:10.1609/aaai.v40i17.38515

Authors

Wei Jiang: Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, Shenzhen Graduate School, Peking University
Yongqi Zhai: Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, Shenzhen Graduate School, Peking University
Jiayu Yang: Pengcheng Laboratory
Bohao Feng: Alibaba Cloud Computing
Wenqiang Wang: Alibaba Cloud Computing
Bo Huang: Alibaba Cloud Computing
Lin Ding: Alibaba Cloud Computing
Ronggang Wang: Shenzhen Graduate School, Peking University; Pengcheng Laboratory

Abstract

Perceptual image compression has recently gained increasing attention, as it aims to reconstruct visually realistic images using generative models. Most existing methods adopt patch-based generative adversarial networks (PatchGAN) for one-step image generation, where adversarial training helps the decoder learn the distribution of natural images. However, this strategy is often coarse-grained, as it focuses mainly on patch-level consistency and overlooks global structural and semantic details. To address this limitation, we propose a simple yet effective Semantic and Spectral Consistency Learning (SSCL) strategy, which complements existing patch-based approaches for more accurate distribution alignment. For semantic consistency, we leverage semantic vision models to extract semantic features. The semantic discriminator, aware of the specific semantics of each image, provides more adaptive and precise feedback. This guides the encoder to retain meaningful information and helps the decoder synthesize detailed textures, without requiring explicit semantic transmission or additional modules. For spectral consistency, we introduce a frequency discriminator that focuses on high-frequency components, helping to reduce artifacts based on spectral priors. Experiments show that SSCL outperforms existing perceptual codecs in terms of visual quality. Compared to MS-ILLM, SSCL achieves 45% to 60% bit-rate savings on the CLIC2020 and Kodak datasets, measured by FID and DISTS.
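The abstract does not say how the frequency discriminator isolates high-frequency components. As a rough illustration only, the sketch below assumes one common choice: a radial high-pass mask applied in the 2-D Fourier domain. The function name `high_pass` and the `cutoff` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def high_pass(image: np.ndarray, cutoff: float = 0.25) -> np.ndarray:
    """Return the high-frequency part of a 2-D grayscale image.

    Assumption (not from the paper): frequencies whose radial distance
    from DC, in cycles per pixel, is below `cutoff` are zeroed out.
    """
    height, width = image.shape
    # Per-axis sample frequencies in cycles per pixel, laid out to broadcast.
    freq_y = np.fft.fftfreq(height)[:, None]
    freq_x = np.fft.fftfreq(width)[None, :]
    radius = np.sqrt(freq_x ** 2 + freq_y ** 2)

    # Keep only components at or above the cutoff radius.
    mask = radius >= cutoff

    spectrum = np.fft.fft2(image)
    # The mask is symmetric in frequency, so the inverse transform of a
    # real input is real up to floating-point error.
    return np.real(np.fft.ifft2(spectrum * mask))


if __name__ == "__main__":
    flat = np.full((8, 8), 3.0)            # no high-frequency content
    checker = (np.indices((8, 8)).sum(axis=0) % 2).astype(float)
    print(np.abs(high_pass(flat)).max())     # close to zero
    print(np.abs(high_pass(checker)).max())  # clearly non-zero
```

Under this assumption, a discriminator would receive the filtered versions of the original and the reconstruction, so that its feedback concentrates on fine texture and artifact patterns rather than on low-frequency content.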
