LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation

Mushui Liu; Yuhang Ma; Zhen Yang; Jun Dan; Yunlong Yu; Zeng Zhao; Zhipeng Hu; Bai Liu; Changjie Fan

doi:10.1609/aaai.v39i5.32588

Authors

Mushui Liu (Zhejiang University; Fuxi AI Lab, NetEase)
Yuhang Ma (Fuxi AI Lab, NetEase)
Zhen Yang (The Hong Kong University of Science and Technology)
Jun Dan (Zhejiang University)
Yunlong Yu (Zhejiang University)
Zeng Zhao (Fuxi AI Lab, NetEase)
Zhipeng Hu (Fuxi AI Lab, NetEase)
Bai Liu (Fuxi AI Lab, NetEase)
Changjie Fan (Fuxi AI Lab, NetEase)

DOI:

https://doi.org/10.1609/aaai.v39i5.32588

Abstract

Diffusion models have exhibited substantial success in text-to-image generation. However, they often encounter challenges when dealing with complex and dense prompts involving multiple objects, attribute binding, and long descriptions. In this paper, we propose a novel framework called LLM4GEN, which enhances the semantic understanding of text-to-image diffusion models by leveraging the representation of Large Language Models (LLMs). It can be seamlessly incorporated into various diffusion models as a plug-and-play component. A specially designed Cross-Adapter Module (CAM) integrates the original text features of text-to-image models with LLM features, thereby enhancing text-to-image generation. Additionally, to facilitate and correct entity-attribute relationships in text prompts, we develop an entity-guided regularization loss to further improve generation performance. We also introduce DensePrompts, which contains 7,000 dense prompts to provide a comprehensive evaluation for the text-to-image generation task. Experiments indicate that LLM4GEN significantly improves the semantic alignment of SD1.5 and SDXL, demonstrating increases of 9.69% and 12.90% in color on T2I-CompBench, respectively. Moreover, it surpasses existing models in terms of sample quality, image-text alignment, and human evaluation.
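The abstract does not spell out the internals of the Cross-Adapter Module, so the following is only a minimal sketch of one plausible way to fuse two token sequences: a single cross-attention step in which the text-encoder tokens act as queries and the LLM tokens act as keys and values, followed by a gated residual add. The function name `cross_attention_fuse`, the `gate` parameter, the absence of learned projections, and the assumption that both feature sets already share one dimension are all illustrative choices, not details taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_fuse(text_feats, llm_feats, gate=1.0):
    """Hypothetical fusion step (an assumption, not the paper's CAM).

    text_feats: list of text-encoder token vectors, used as queries.
    llm_feats:  list of LLM token vectors, used as keys and values.
    gate:       scalar weight on the attended LLM signal.

    Both inputs are assumed to share the same feature dimension; a real
    adapter would normally apply learned projections first.
    """
    d = len(text_feats[0])
    scale = 1.0 / math.sqrt(d)
    fused = []
    for q in text_feats:
        # Scaled dot-product score of this text token against every LLM token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in llm_feats]
        weights = softmax(scores)
        # Attention-weighted average of the LLM token vectors.
        attended = [
            sum(w * v[j] for w, v in zip(weights, llm_feats)) for j in range(d)
        ]
        # Gated residual: keep the original text feature, add the LLM signal.
        fused.append([qi + gate * ai for qi, ai in zip(q, attended)])
    return fused
```

With `gate=0.0` the function returns the original text features unchanged, which mirrors the plug-and-play idea of leaving the base model's conditioning intact when the adapter contributes nothing. In a diffusion pipeline the fused sequence would stand in for the usual text conditioning, though how LLM4GEN wires this in is not described here.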
