VQ-FONT: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization

Authors

  • Mingshuai Yao, Harbin Institute of Technology
  • Yabo Zhang, Harbin Institute of Technology
  • Xianhui Lin, Institute for Intelligent Computing
  • Xiaoming Li, Harbin Institute of Technology
  • Wangmeng Zuo, Harbin Institute of Technology; Peng Cheng Laboratory

DOI:

https://doi.org/10.1609/aaai.v38i15.29577

Keywords:

ML: Transfer, Domain Adaptation, Multi-Task Learning, CV: Computational Photography, Image & Video Synthesis

Abstract

Few-shot font generation is challenging, as it needs to capture the fine-grained stroke styles from a limited set of reference glyphs and then transfer them to other characters, which are expected to have similar styles. However, due to the diversity and complexity of Chinese font styles, the glyphs synthesized by existing methods usually exhibit visible artifacts, such as missing details and distorted strokes. In this paper, we propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement. Specifically, we pre-train a VQGAN to encapsulate the font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes. Furthermore, our VQ-Font leverages the inherent design of Chinese characters, where structure components such as radicals and character components are combined in specific arrangements, to recalibrate fine-grained styles based on references. This process improves the matching and fusion of styles at the structure level. Both modules collaborate to enhance the fidelity of the generated fonts. Experiments on a collected font dataset show that our VQ-Font outperforms the competing methods both quantitatively and qualitatively, especially in generating challenging styles. Our code is available at https://github.com/Yaomingshuai/VQ-Font.
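
The codebook refinement mentioned in the abstract builds on vector quantization, in which each feature vector is replaced by its nearest codebook entry. The following is a minimal sketch of that generic lookup step only, not the authors' implementation (see the repository linked above for that). The names `squared_distance` and `quantize`, the toy two-dimensional codebook, and the use of squared Euclidean distance are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: List[Vector], codebook: List[Vector]
) -> Tuple[List[int], List[Vector]]:
    """Map each feature vector to its nearest codebook entry.

    Returns the chosen codebook indices (the "tokens") and the
    corresponding quantized vectors.
    """
    indices = []
    for feature in features:
        # Hypothetical choice: nearest neighbour under squared Euclidean distance.
        nearest = min(
            range(len(codebook)),
            key=lambda k: squared_distance(feature, codebook[k]),
        )
        indices.append(nearest)
    quantized = [codebook[i] for i in indices]
    return indices, quantized


# Toy data (illustrative only): a 3-entry codebook of 2-D vectors.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
features = [(0.1, -0.1), (0.9, 0.8), (0.2, 0.7)]
indices, quantized = quantize(features, codebook)
```

In a learned model the codebook entries and the feature vectors would come from a trained encoder rather than being written by hand; the sketch shows only how noisy features snap to a fixed set of tokens.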

Published

2024-03-24

How to Cite

Yao, M., Zhang, Y., Lin, X., Li, X., & Zuo, W. (2024). VQ-FONT: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization. Proceedings of the AAAI Conference on Artificial Intelligence, 38(15), 16407-16415. https://doi.org/10.1609/aaai.v38i15.29577

Section

AAAI Technical Track on Machine Learning VI