Learned Bi-Resolution Image Coding using Generalized Octave Convolutions

Authors

  • Mohammad Akbari, Simon Fraser University, Canada
  • Jie Liang, Simon Fraser University, Canada
  • Jingning Han, Google Inc., Mountain View
  • Chengjie Tu, Tencent Technologies

DOI:

https://doi.org/10.1609/aaai.v35i8.16816

Keywords:

Neural Generative Models & Autoencoders, Data Compression, Image and Video Retrieval, Dimensionality Reduction/Feature Selection

Abstract

Learned image compression has recently shown the potential to outperform standard codecs. State-of-the-art rate-distortion (R-D) performance has been achieved by context-adaptive entropy coding approaches, in which hyperprior and autoregressive models are used jointly to capture the spatial dependencies in the latent representations. In previous works, however, the latents are feature maps that all share the same spatial resolution, and they therefore contain redundancies that degrade the R-D performance. In this paper, we propose a learned bi-resolution image coding approach based on the recently developed octave convolutions, which factorizes the latents into high- and low-resolution components. This reduces the spatial redundancy and thereby improves the R-D performance. Novel generalized octave convolution and octave transposed-convolution architectures with internal activation layers are also proposed to preserve more of the spatial structure of the information. Experimental results show that the proposed scheme outperforms all existing learned methods as well as standard codecs such as the next-generation video coding standard VVC (4:2:0) in both PSNR and MS-SSIM. We also show that the proposed generalized octave convolution can improve the performance of other auto-encoder-based schemes for tasks such as semantic segmentation and image denoising.
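
As a rough illustration of why factorizing latents into two resolutions reduces the amount of spatial data, the sketch below splits a feature map into a full-resolution channel group and a half-resolution channel group and counts the resulting elements. It is a minimal sketch under stated assumptions, not the method of the paper: the names split_bi_resolution, avg_pool_2x2, element_count and the ratio parameter alpha are hypothetical, and 2x2 average pooling is used here only as a simple stand-in for whatever down-sampling the generalized octave convolution actually applies.

```python
def avg_pool_2x2(channel):
    """Halve a 2-D channel (list of rows) by averaging each 2x2 block."""
    h, w = len(channel), len(channel[0])
    return [
        [
            (channel[i][j] + channel[i][j + 1]
             + channel[i + 1][j] + channel[i + 1][j + 1]) / 4.0
            for j in range(0, w, 2)
        ]
        for i in range(0, h, 2)
    ]


def split_bi_resolution(x, alpha=0.5):
    """Split a (C, H, W) nested-list feature map into two channel groups.

    Assumption: a fraction `alpha` of the channels is stored at half
    resolution; the remaining channels keep the full resolution.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    c_low = int(round(alpha * c))
    c_high = c - c_low
    high = list(x[:c_high])                        # full resolution
    low = [avg_pool_2x2(ch) for ch in x[c_high:]]  # half resolution
    return high, low


def element_count(c, h, w, alpha):
    """Number of stored values after the bi-resolution split."""
    c_low = int(round(alpha * c))
    return (c - c_low) * h * w + c_low * (h // 2) * (w // 2)


# Toy feature map: 4 channels of 4x4 values.
x = [[[c * 16 + i * 4 + j for j in range(4)] for i in range(4)]
     for c in range(4)]
high, low = split_bi_resolution(x, alpha=0.5)
single_resolution = 4 * 4 * 4
bi_resolution = element_count(4, 4, 4, alpha=0.5)
```

With half of the channels kept at half resolution, the toy example stores 40 values instead of 64. Each low-resolution channel holds a quarter of the elements of a full-resolution one, which is the sense in which a bi-resolution representation can carry less spatial redundancy.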

Published

2021-05-18

How to Cite

Akbari, M., Liang, J., Han, J., & Tu, C. (2021). Learned Bi-Resolution Image Coding using Generalized Octave Convolutions. Proceedings of the AAAI Conference on Artificial Intelligence, 35(8), 6592-6599. https://doi.org/10.1609/aaai.v35i8.16816

Section

AAAI Technical Track on Machine Learning I