FEditNet: Few-Shot Editing of Latent Semantics in GAN Spaces

Authors

  • Mengfei Xia Tsinghua University
  • Yezhi Shu Tsinghua University
  • Yuji Wang Tsinghua University
  • Yu-Kun Lai Cardiff University
  • Qiang Li Kuaishou Technology
  • Pengfei Wan Kuaishou Technology
  • Zhongyuan Wang Kuaishou Technology
  • Yong-Jin Liu Tsinghua University

DOI:

https://doi.org/10.1609/aaai.v37i3.25394

Keywords:

CV: Computational Photography, Image & Video Synthesis

Abstract

Generative Adversarial Networks (GANs) have demonstrated a powerful capability for synthesizing high-resolution images, and great efforts have been made to interpret the semantics in the latent spaces of GANs. However, existing works still have the following limitations: (1) the majority of works rely on either pretrained attribute predictors or large-scale labeled datasets, which are difficult to collect in most cases, and (2) some other methods are suitable only for restricted cases, such as focusing on the interpretation of human facial images using prior facial semantics. In this paper, we propose a GAN-based method called FEditNet, which aims to discover latent semantics from very few labeled data without any pretrained predictors or prior knowledge. Specifically, we reuse the knowledge from the pretrained GANs and thereby avoid overfitting during the few-shot training of FEditNet. Moreover, our layer-wise objectives, which take content consistency into account, also ensure disentanglement between attributes. Qualitative and quantitative results demonstrate that our method outperforms state-of-the-art methods on various datasets. The code is available at https://github.com/THU-LYJ-Lab/FEditNet.
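The abstract describes discovering semantic directions in a GAN's latent space and editing images by moving latent codes along them. The sketch below illustrates only the generic latent-editing operation that such methods build on (shifting a code along a unit-normalized direction); it does not reproduce FEditNet's few-shot training procedure, and the 512-dimensional latent size is an assumption borrowed from common StyleGAN-style setups.

```python
import numpy as np

def edit_latent(w: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """Move a latent code `w` a distance `alpha` along a semantic direction.

    The direction is unit-normalized first, so `alpha` directly controls
    the edit strength. A generator G would then map the edited code to an
    image, G(w_edit); the generator itself is omitted here.
    """
    d = direction / np.linalg.norm(direction)
    return w + alpha * d

# Toy example: a random latent code and a random candidate direction.
rng = np.random.default_rng(0)
w = rng.standard_normal(512)
d = rng.standard_normal(512)

w_edit = edit_latent(w, d, alpha=3.0)
# Because the direction is normalized, the code moves exactly 3 units.
print(float(np.linalg.norm(w_edit - w)))
```

Few-shot methods in this space differ mainly in how the direction `d` is found; per the abstract, FEditNet learns it from very few labeled examples while reusing the pretrained generator's knowledge to avoid overfitting.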

Published

2023-06-26

How to Cite

Xia, M., Shu, Y., Wang, Y., Lai, Y.-K., Li, Q., Wan, P., Wang, Z., & Liu, Y.-J. (2023). FEditNet: Few-Shot Editing of Latent Semantics in GAN Spaces. Proceedings of the AAAI Conference on Artificial Intelligence, 37(3), 2919-2927. https://doi.org/10.1609/aaai.v37i3.25394

Section

AAAI Technical Track on Computer Vision III