GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models

Jian Ma; Yonglin Deng; Chen Chen; Nanyang Du; Haonan Lu; Zhenyu Yang

doi:10.1609/aaai.v39i6.32636

Authors

Jian Ma, OPPO AI Center
Yonglin Deng, The Chinese University of Hong Kong, Shenzhen
Chen Chen, OPPO AI Center
Nanyang Du, Tsinghua University
Haonan Lu, OPPO AI Center
Zhenyu Yang, OPPO AI Center

DOI: https://doi.org/10.1609/aaai.v39i6.32636

Abstract

Posters serve an essential function in marketing and advertising by improving visual communication and brand visibility, thus significantly contributing to industrial design. With the latest developments in controllable T2I diffusion models, research interest has surged in text rendering within synthesized images. Although text rendering accuracy has seen advancements, automatic poster generation remains a relatively untapped area. This paper presents an automatic poster generation framework featuring text rendering capabilities through the use of LLMs. Our framework employs a triple-cross attention mechanism based on alignment learning to achieve precise text placement within detailed contextual backgrounds. Moreover, it supports adjustable fonts, varying image resolutions, and poster rendering with textual prompts in both English and Chinese. Additionally, we present a comprehensive bilingual image-text dataset, GlyphDraw-3M, comprising 3 million image-text pairs, each with OCR annotations and resolutions exceeding 1024. Our method utilizes the SDXL architecture, and extensive experiments confirm its ability to generate posters with intricate and context-rich backgrounds.
