SSCL: Adversarially Guided Image Compression via Semantic and Spectral Consistency Learning
DOI:
https://doi.org/10.1609/aaai.v40i17.38515
Abstract
Perceptual image compression has recently gained increasing attention, as it aims to reconstruct visually realistic images using generative models. Most existing methods adopt patch-based generative adversarial networks (PatchGAN) for one-step image generation, where adversarial training helps the decoder learn the distribution of natural images. However, this strategy is often coarse-grained, as it focuses mainly on patch-level consistency and overlooks global structural and semantic details. To address this limitation, we propose a simple yet effective Semantic and Spectral Consistency Learning (SSCL) strategy, which complements existing patch-based approaches for more accurate distribution alignment. For semantic consistency, we leverage semantic vision models to extract semantic features. The semantic discriminator, aware of the specific semantics of each image, provides more adaptive and precise feedback. This guides the encoder to retain meaningful information and helps the decoder synthesize detailed textures, without requiring explicit semantic transmission or additional modules. For spectral consistency, we introduce a frequency discriminator that focuses on high-frequency components, helping to reduce artifacts based on spectral priors. Experiments show that SSCL outperforms existing perceptual codecs in terms of visual quality. Compared to MS-ILLM, SSCL achieves 45% to 60% bit-rate savings on CLIC2020 and Kodak datasets, measured by FID and DISTS.
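The abstract does not say how the frequency discriminator's high-frequency input is obtained. As a rough illustration only, the sketch below isolates high-frequency content with a radial high-pass mask in the 2-D Fourier domain. The function name `high_frequency_component` and the `cutoff` parameter are assumptions made for this example and are not taken from the paper; the authors' actual design may differ.

```python
import numpy as np


def high_frequency_component(image: np.ndarray, cutoff: float = 0.25) -> np.ndarray:
    """Return a high-pass-filtered copy of a 2-D grayscale image.

    `cutoff` is a hypothetical parameter: frequencies whose radius (in
    cycles per pixel, where 0.5 is the Nyquist limit) falls below it are
    zeroed, so only the high-frequency content survives.
    """
    height, width = image.shape
    spectrum = np.fft.fft2(image)

    # Per-axis frequencies in cycles per pixel, laid out like fft2's output.
    freq_y = np.fft.fftfreq(height)[:, None]
    freq_x = np.fft.fftfreq(width)[None, :]
    radius = np.sqrt(freq_x**2 + freq_y**2)

    # Keep only the coefficients at or above the cutoff radius.
    mask = radius >= cutoff

    # The mask is symmetric, so the inverse transform of a real image is
    # real up to floating-point error; drop the residual imaginary part.
    return np.real(np.fft.ifft2(spectrum * mask))


# A flat image has only a DC term, so nothing survives the filter, while a
# pixel-level checkerboard sits at the Nyquist frequency and passes intact.
flat = np.ones((8, 8))
checkerboard = (np.indices((8, 8)).sum(axis=0) % 2) * 2.0 - 1.0
flat_high = high_frequency_component(flat)
checkerboard_high = high_frequency_component(checkerboard)
```

In an adversarial setup along these lines, the filtered original and the filtered reconstruction would be the inputs a frequency discriminator compares; that wiring is likewise an assumption, not a description of SSCL.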
Published
2026-03-14
How to Cite
Jiang, W., Zhai, Y., Yang, J., Feng, B., Wang, W., Huang, B., … Wang, R. (2026). SSCL: Adversarially Guided Image Compression via Semantic and Spectral Consistency Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(17), 14937–14945. https://doi.org/10.1609/aaai.v40i17.38515
Issue
Section
AAAI Technical Track on Data Mining & Knowledge Management I