DeepCalliFont: Few-Shot Chinese Calligraphy Font Synthesis by Integrating Dual-Modality Generative Models

Authors

  • Yitian Liu, Wangxuan Institute of Computer Technology, Peking University, Beijing, P.R. China
  • Zhouhui Lian, Wangxuan Institute of Computer Technology, Peking University, Beijing, P.R. China

DOI:

https://doi.org/10.1609/aaai.v38i4.28168

Keywords:

CV: Computational Photography, Image & Video Synthesis, CV: Applications

Abstract

Few-shot font generation, especially for Chinese calligraphy fonts, remains a challenging and open problem. With the help of prior knowledge based mainly on glyph-consistency assumptions, some recently proposed methods can synthesize high-quality Chinese glyph images. However, glyphs in calligraphy font styles often violate these assumptions. To address this problem, we propose a novel model, DeepCalliFont, for few-shot Chinese calligraphy font synthesis that integrates dual-modality generative models. Specifically, the proposed model consists of an image-synthesis branch and a sequence-generation branch, which produce consistent results via a dual-modality representation learning strategy. The two modalities (i.e., glyph images and writing sequences) are integrated using a feature recombination module and a rasterization loss function. Furthermore, a new pre-training strategy is adopted to improve performance by exploiting large amounts of uni-modality data. Both qualitative and quantitative experiments demonstrate the superiority of our method over other state-of-the-art approaches on the task of few-shot Chinese calligraphy font synthesis. The source code can be found at https://github.com/lsflyt-pku/DeepCalliFont.
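To make the dual-branch idea concrete, below is a minimal, hypothetical PyTorch sketch of how an image-synthesis branch and a writing-sequence branch might exchange features through a recombination module and be tied together by a differentiable rasterization loss. All module names, tensor shapes, and the Gaussian soft-rasterizer are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch of a dual-modality glyph model (illustrative only,
# not the DeepCalliFont implementation). An image branch and a sequence
# branch share a fused representation; a soft rasterizer renders the
# predicted writing trajectory so it can be compared with the image output.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureRecombination(nn.Module):
    """Mix pooled image-branch and sequence-branch features (assumed design)."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, img_feat, seq_feat):
        # img_feat, seq_feat: (B, dim)
        return self.proj(torch.cat([img_feat, seq_feat], dim=-1))

class DualModalityModel(nn.Module):
    def __init__(self, dim=256, max_points=64, img_size=64):
        super().__init__()
        self.max_points = max_points
        # Image branch: tiny conv encoder + deconv decoder.
        self.img_enc = nn.Sequential(
            nn.Conv2d(1, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
            nn.Flatten(), nn.Linear(64 * (img_size // 4) ** 2, dim))
        self.img_dec = nn.Sequential(
            nn.Linear(dim, 64 * (img_size // 4) ** 2), nn.ReLU(),
            nn.Unflatten(1, (64, img_size // 4, img_size // 4)),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Sigmoid())
        # Sequence branch: GRU over writing-trajectory points (x, y) in [0, 1].
        self.seq_enc = nn.GRU(2, dim, batch_first=True)
        self.seq_dec = nn.Linear(dim, max_points * 2)
        self.recombine = FeatureRecombination(dim)

    def forward(self, img, seq):
        img_feat = self.img_enc(img)              # (B, dim)
        _, h = self.seq_enc(seq)                  # h: (1, B, dim)
        fused = self.recombine(img_feat, h.squeeze(0))
        out_img = self.img_dec(fused)             # (B, 1, H, W)
        out_seq = self.seq_dec(fused).view(-1, self.max_points, 2).sigmoid()
        return out_img, out_seq

def soft_rasterize(points, size=64, sigma=1.5):
    """Render trajectory points as Gaussian blobs (differentiable)."""
    coords = torch.arange(size, device=points.device).float()
    gy, gx = torch.meshgrid(coords, coords, indexing="ij")   # (H, W)
    px = points[..., 0].unsqueeze(-1).unsqueeze(-1) * (size - 1)
    py = points[..., 1].unsqueeze(-1).unsqueeze(-1) * (size - 1)
    d2 = (gx - px) ** 2 + (gy - py) ** 2                     # (B, N, H, W)
    canvas = torch.exp(-d2 / (2 * sigma ** 2)).amax(dim=1)   # max over points
    return canvas.unsqueeze(1)                               # (B, 1, H, W)

def rasterization_loss(out_seq, out_img):
    """Consistency term between the rendered sequence and the image branch."""
    return F.l1_loss(soft_rasterize(out_seq, out_img.shape[-1]), out_img)

# Example usage (random tensors as stand-ins for real glyph data):
# model = DualModalityModel()
# img, seq = torch.rand(2, 1, 64, 64), torch.rand(2, 32, 2)
# out_img, out_seq = model(img, seq)
# loss = rasterization_loss(out_seq, out_img)
```

In training, such a rasterization term would typically be combined with image-reconstruction and sequence losses so that the two branches converge to consistent glyphs; the paper's actual recombination module and rendering procedure are more elaborate than this sketch.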

Published

2024-03-24

How to Cite

Liu, Y., & Lian, Z. (2024). DeepCalliFont: Few-Shot Chinese Calligraphy Font Synthesis by Integrating Dual-Modality Generative Models. Proceedings of the AAAI Conference on Artificial Intelligence, 38(4), 3774-3782. https://doi.org/10.1609/aaai.v38i4.28168

Issue

Vol. 38 No. 4 (2024): AAAI-24 Technical Tracks 4
Section

AAAI Technical Track on Computer Vision III