SEIT: Structural Enhancement for Unsupervised Image Translation in Frequency Domain

Authors

  • Zhifeng Zhu, School of Software Engineering, Xi’an Jiaotong University
  • Yaochen Li, School of Software Engineering, Xi’an Jiaotong University
  • Yifan Li, School of Software Engineering, Xi’an Jiaotong University
  • Jinhuo Yang, School of Software Engineering, Xi’an Jiaotong University
  • Peijun Chen, School of Software Engineering, Xi’an Jiaotong University
  • Yuehu Liu, Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University

DOI:

https://doi.org/10.1609/aaai.v38i7.28617

Keywords:

CV: Vision for Robotics & Autonomous Driving, CV: Computational Photography, Image & Video Synthesis, CV: Other Foundations of Computer Vision

Abstract

In unsupervised image translation, transforming the style of an image while preserving its original structure remains challenging. In this paper, we propose SEIT, an unsupervised image translation method with structural enhancement in the frequency domain. Specifically, a frequency dynamic adaptive (FDA) module is designed for style transformation; by decoupling image content and style in the frequency domain, it transfers the image style while maintaining the overall structure. Moreover, a wavelet-based structure enhancement (WSE) module improves the intermediate translation results by matching high-frequency information, thereby enriching structural details. Furthermore, a multi-scale network architecture is designed to extract domain-specific information using image-independent encoders for both the source and target domains. Extensive experimental results demonstrate the effectiveness of the proposed method.
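
The abstract does not specify how the FDA and WSE modules are built, so the following is only an illustrative sketch of the two general ideas it names, using standard tools rather than the authors' method. It assumes that the Fourier amplitude can stand in for "style" and the Fourier phase for "structure", and that a single-level Haar transform can stand in for the wavelet decomposition. The function names and the `beta` parameter are hypothetical and do not come from the paper.

```python
import numpy as np


def swap_low_freq_amplitude(content, style, beta=0.1):
    """Keep the phase of `content`, borrow low-frequency amplitude from `style`.

    Assumption (not from the paper): amplitude ~ style, phase ~ structure.
    Both inputs are 2-D float arrays of the same shape; beta is in [0, 0.5].
    """
    fc = np.fft.fft2(content)
    fs = np.fft.fft2(style)
    # Shift so that the zero frequency sits at the centre of the spectrum.
    amp_c = np.fft.fftshift(np.abs(fc))
    amp_s = np.fft.fftshift(np.abs(fs))
    phase_c = np.angle(fc)

    h, w = content.shape
    ch, cw = h // 2, w // 2
    bh, bw = int(h * beta), int(w * beta)
    # Window centred on the zero frequency, clamped to the array bounds.
    rows = slice(max(ch - bh, 0), ch + bh + 1)
    cols = slice(max(cw - bw, 0), cw + bw + 1)
    amp_c[rows, cols] = amp_s[rows, cols]

    mixed = np.fft.ifftshift(amp_c) * np.exp(1j * phase_c)
    return np.real(np.fft.ifft2(mixed))


def haar_high_freq(img):
    """Single-level 2-D Haar detail bands of an image with even height and width."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    lh = (a + b - c - d) / 2.0  # differences between rows
    hl = (a - b + c - d) / 2.0  # differences between columns
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return lh, hl, hh


def high_freq_l1(x, y):
    """Mean absolute difference between the Haar detail bands of x and y."""
    return sum(
        float(np.mean(np.abs(p - q)))
        for p, q in zip(haar_high_freq(x), haar_high_freq(y))
    )
```

In this sketch, `swap_low_freq_amplitude(content, style)` yields an image that keeps the edges and layout of `content` while taking on the global intensity statistics of `style`. `high_freq_l1(translated, source)` could then serve as a penalty that grows when fine structural detail is lost. The actual modules in the paper are presumably learned and may differ substantially.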

Published

2024-03-24

How to Cite

Zhu, Z., Li, Y., Li, Y., Yang, J., Chen, P., & Liu, Y. (2024). SEIT: Structural Enhancement for Unsupervised Image Translation in Frequency Domain. Proceedings of the AAAI Conference on Artificial Intelligence, 38(7), 7820–7827. https://doi.org/10.1609/aaai.v38i7.28617

Section

AAAI Technical Track on Computer Vision VI