GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models

Authors

  • Jian Ma, OPPO AI Center
  • Yonglin Deng, The Chinese University of Hong Kong, ShenZhen
  • Chen Chen, OPPO AI Center
  • Nanyang Du, Tsinghua University
  • Haonan Lu, OPPO AI Center
  • Zhenyu Yang, OPPO AI Center

DOI:

https://doi.org/10.1609/aaai.v39i6.32636

Abstract

Posters serve an essential function in marketing and advertising by improving visual communication and brand visibility, thus significantly contributing to industrial design. With the latest developments in controllable text-to-image (T2I) diffusion models, research interest has surged in text rendering within synthesized images. Although text rendering accuracy has seen advancements, automatic poster generation remains a relatively untapped area. This paper presents an automatic poster generation framework featuring text rendering capabilities through the use of large language models (LLMs). Our framework employs a triple-cross attention mechanism based on alignment learning to achieve precise text placement within detailed contextual backgrounds. Moreover, it supports adjustable fonts, varying image resolutions, and poster rendering with textual prompts in both English and Chinese. Additionally, we present a comprehensive bilingual image-text dataset, GlyphDraw-3M, comprising 3 million image-text pairs, each with OCR annotations and resolutions exceeding 1024. Our method utilizes the SDXL architecture, and extensive experiments confirm its ability to generate posters with intricate and context-rich backgrounds.

Published

2025-04-11

How to Cite

Ma, J., Deng, Y., Chen, C., Du, N., Lu, H., & Yang, Z. (2025). GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 39(6), 5955–5963. https://doi.org/10.1609/aaai.v39i6.32636

Section

AAAI Technical Track on Computer Vision V