Layout Generation as Intermediate Action Sequence Prediction
DOI:
https://doi.org/10.1609/aaai.v37i9.26277
Keywords:
ML: Deep Generative Models & Autoencoders, CV: Applications, CV: Computational Photography, Image & Video Synthesis, ML: Applications, ML: Deep Neural Network Algorithms, ML: Evaluation and Analysis (Machine Learning)
Abstract
Layout generation plays a crucial role in graphic design intelligence. One important characteristic of graphic layouts is that they usually follow certain design principles. For example, the principle of repetition emphasizes the reuse of similar visual elements throughout the design. To generate a layout, previous works mainly attempt to predict the absolute bounding box values of each element, a target representation that hides the information of higher-order design operations such as repetition (e.g., copying the size of the previously generated element). In this paper, we introduce a novel action schema that encodes these operations to better model the generation process. Instead of predicting the bounding box values, our approach autoregressively outputs an intermediate action sequence, which can then be deterministically converted to the final layout. We achieve state-of-the-art performance on three datasets. Both automatic and human evaluations show that our approach generates high-quality and diverse layouts. Furthermore, we revisit the commonly used evaluation metric FID as adapted to this task, and observe that previous works use different settings to train the feature extractor for obtaining the real and generated data distributions, which leads to inconsistent conclusions. We conduct an in-depth analysis of this metric and settle on a more robust and reliable evaluation setting. Code is available at this website.
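To make the idea of an intermediate action sequence concrete, the following is a minimal sketch and not the schema used in the paper. It assumes a hypothetical two-action vocabulary: `set` writes an absolute value, and `copy` reuses the same attribute from an earlier element. It shows only the deterministic conversion from actions to bounding boxes. The names `Action`, `decode`, `kind`, `value`, and `ref` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A bounding box as (x, y, w, h); the attribute order is an assumption.
Box = Tuple[float, float, float, float]


@dataclass
class Action:
    """Hypothetical action for one box attribute (not the paper's schema)."""
    kind: str           # "set" or "copy"
    value: float = 0.0  # absolute value, used by "set"
    ref: int = 0        # index of an earlier element, used by "copy"


def decode(actions: List[List[Action]]) -> List[Box]:
    """Deterministically convert per-element actions into bounding boxes.

    Each element carries exactly four actions, one per attribute (x, y, w, h).
    """
    boxes: List[Box] = []
    for elem in actions:
        if len(elem) != 4:
            raise ValueError("each element needs four actions: x, y, w, h")
        coords = []
        for attr, act in enumerate(elem):
            if act.kind == "set":
                coords.append(act.value)
            elif act.kind == "copy":
                # Only elements that were already generated can be referenced.
                if not 0 <= act.ref < len(boxes):
                    raise ValueError("copy must reference an earlier element")
                coords.append(boxes[act.ref][attr])
            else:
                raise ValueError(f"unknown action kind: {act.kind}")
        boxes.append((coords[0], coords[1], coords[2], coords[3]))
    return boxes


# The second element repeats the first element's y, width, and height.
layout = decode([
    [Action("set", 0.1), Action("set", 0.1), Action("set", 0.3), Action("set", 0.2)],
    [Action("set", 0.5), Action("copy", ref=0), Action("copy", ref=0), Action("copy", ref=0)],
])
```

In this sketch the repetition is explicit in the target sequence: the second element's size is expressed as a reference to the first element, not as three independently predicted numbers.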
Published
2023-06-26
How to Cite
Yang, H., Huang, D., Lin, C.-Y., & He, S. (2023). Layout Generation as Intermediate Action Sequence Prediction. Proceedings of the AAAI Conference on Artificial Intelligence, 37(9), 10762–10770. https://doi.org/10.1609/aaai.v37i9.26277
Section
AAAI Technical Track on Machine Learning IV