Plot’n Polish: Zero-Shot Story Visualization and Disentangled Editing with Text-to-Image Diffusion Models

Kiymet Akdemir; Jing Shi; Kushal Kafle; Brian L. Price; Pinar Yanardag

doi:10.1609/aaai.v40i3.37147

Authors

Kiymet Akdemir Virginia Tech, Blacksburg, VA, USA
Jing Shi Adobe Research, San Jose, CA, USA
Kushal Kafle Adobe Research, San Jose, CA, USA
Brian L. Price Adobe Research, San Jose, CA, USA
Pinar Yanardag Virginia Tech, Blacksburg, VA, USA

DOI:

https://doi.org/10.1609/aaai.v40i3.37147

Abstract

Text-to-image diffusion models have demonstrated significant capabilities to generate diverse and detailed visuals in various domains, and story visualization is emerging as a particularly promising application. However, as their use in real-world creative domains increases, providing enhanced control, refinement, and consistent post-generation editing of images becomes an important challenge. Existing methods often lack the flexibility to apply fine or coarse edits while maintaining visual and narrative consistency across multiple frames, preventing creators from seamlessly crafting and refining their visual stories. To address these challenges, we introduce Plot’n Polish, a zero-shot framework that enables consistent story generation and provides fine-grained control over story visualizations at various levels of detail.

Published

2026-03-14

How to Cite

Akdemir, K., Shi, J., Kafle, K., Price, B. L., & Yanardag, P. (2026). Plot’n Polish: Zero-Shot Story Visualization and Disentangled Editing with Text-to-Image Diffusion Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(3), 1694–1702. https://doi.org/10.1609/aaai.v40i3.37147

Issue

Vol. 40 No. 3: AAAI-26 Technical Tracks 3

Section

AAAI Technical Track on Cognitive Modeling & Cognitive Systems
