G-IR: Geometric Image Representation for Learning

Xin Chen; Qi Zhao; Wei Zeng; Zongben Xu

doi:10.1609/aaai.v40i24.39119

Authors

Xin Chen, Xi'an Jiaotong University
Qi Zhao, Xi'an Jiaotong University
Wei Zeng, Xi'an Jiaotong University
Zongben Xu, Xi'an Jiaotong University

DOI:

https://doi.org/10.1609/aaai.v40i24.39119

Abstract

Images are generally represented by pixel intensities or color values, which are typically fed directly into learning models. This study proposes a geometric image representation method and reformulates a general learning model (e.g., an autoencoder) in the diffeomorphic space. Based on the theory of geometric optimal transport and quasiconformal mapping, we transform the intensity representation into an equivalent shape representation. The image space becomes a diffeomorphic space in which any image can be uniquely represented as a Beltrami coefficient function defined on a uniform reference grid, and vice versa. This geometric image representation (G-IR) captures the fine-grained structure inherent in the entire image, in contrast to traditional feature extraction, which focuses on geometric objects inside the image (such as boundaries and axes). The diffeomorphic property preserves structure during generation, which is essential in real-world physical applications. G-IR can be inserted into existing pipelines as a plug-in, providing structure-preserving properties for the entire framework. Experiments on image restoration and interpolation validate the efficiency, efficacy, and applicability of G-IR and demonstrate superior performance compared with common pixel-level image appearance representations.
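
For readers unfamiliar with the Beltrami coefficient mentioned in the abstract, the sketch below may help. It is not the authors' implementation, and it omits the optimal transport step that turns image intensities into a mapping. It only illustrates the standard definition mu = f_zbar / f_z for a planar map f = u + iv sampled on a uniform grid. The function name `beltrami_coefficient`, the grid size, and the use of finite differences are assumptions made for illustration.

```python
import numpy as np

def beltrami_coefficient(u, v, h=1.0):
    """Approximate mu = f_zbar / f_z for f = u + i*v on a uniform grid.

    u, v: 2D arrays indexed [row (y), column (x)]; h: grid spacing.
    Hypothetical helper for illustration, not part of G-IR itself.
    """
    # np.gradient returns the derivative along axis 0 (y) first, then axis 1 (x).
    u_y, u_x = np.gradient(u, h)
    v_y, v_x = np.gradient(v, h)
    f_x = u_x + 1j * v_x
    f_y = u_y + 1j * v_y
    # Wirtinger derivatives of f with respect to z and conj(z).
    f_z = 0.5 * (f_x - 1j * f_y)
    f_zbar = 0.5 * (f_x + 1j * f_y)
    return f_zbar / f_z

# Uniform reference grid: ys holds row indices, xs holds column indices.
ys, xs = np.mgrid[0:8, 0:8].astype(float)

# The identity map is conformal, so its Beltrami coefficient vanishes.
mu_identity = beltrami_coefficient(xs, ys)

# Stretching x by 2 gives f = 1.5*z + 0.5*conj(z), so mu = 1/3 everywhere.
mu_stretch = beltrami_coefficient(2.0 * xs, ys)
```

A map with |mu| < 1 everywhere preserves orientation and is locally invertible, which is the property that allows a coefficient field to stand in for a diffeomorphism.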
