HieraFashDiff: Hierarchical Fashion Design with Multi-stage Diffusion Models

Authors

  • Zhifeng Xie, Department of Film and Television Engineering, Shanghai University; Shanghai Engineering Research Center of Motion Picture Special Effects
  • Hao Li, Department of Film and Television Engineering, Shanghai University
  • Huiming Ding, Department of Film and Television Engineering, Shanghai University
  • Mengtian Li, Department of Film and Television Engineering, Shanghai University; Shanghai Engineering Research Center of Motion Picture Special Effects
  • Xinhan Di, AI Lab, Giant Network
  • Ying Cao, School of Information Science and Technology, ShanghaiTech University

DOI:

https://doi.org/10.1609/aaai.v39i8.32947

Abstract

Fashion design is a challenging and complex process. Recent works on fashion generation and editing are agnostic to the actual fashion design process, which limits their use in practice. In this paper, we propose a novel hierarchical diffusion-based framework tailored for fashion design, named HieraFashDiff. Our model is designed to mimic the practical fashion design workflow by unraveling the denoising process into two successive stages: 1) an ideation stage that generates design proposals given high-level concepts, and 2) an iteration stage that continuously refines the proposals using low-level attributes. Our model supports fashion design generation and fine-grained local editing in a single framework. To train our model, we contribute a new dataset of full-body fashion images annotated with hierarchical text descriptions. Extensive evaluations show that, compared to prior approaches, our method generates fashion designs and edited results with higher fidelity and better prompt adherence, showing its potential to augment the practical fashion design workflow.
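
The two-stage split of the denoising process described in the abstract can be pictured with a toy schedule. The following is a minimal sketch and not the authors' implementation: the function name `two_stage_schedule`, the `split` parameter, the example prompts, and the choice to assign the noisier timesteps to the ideation stage are all assumptions made for illustration only. No diffusion model is run; the code only shows how each reverse-diffusion timestep could be paired with a stage and its text condition.

```python
from typing import List, Tuple


def two_stage_schedule(
    num_steps: int, split: int, high_level: str, low_level: str
) -> List[Tuple[int, str, str]]:
    """Pair each reverse-diffusion timestep with a stage and a text condition.

    Assumption (hypothetical, not taken from the paper): timesteps t >= split,
    the noisier ones, belong to the ideation stage, and timesteps t < split
    belong to the iteration stage.
    """
    if not 0 < split < num_steps:
        raise ValueError("split must lie strictly between 0 and num_steps")
    schedule = []
    # Reverse diffusion runs from the noisiest timestep down to t = 0.
    for t in range(num_steps - 1, -1, -1):
        if t >= split:
            # Ideation: condition on the high-level design concept.
            schedule.append((t, "ideation", high_level))
        else:
            # Iteration: condition on low-level garment attributes.
            schedule.append((t, "iteration", low_level))
    return schedule


# Hypothetical prompts, chosen only to make the example concrete.
plan = two_stage_schedule(
    num_steps=10,
    split=6,
    high_level="casual summer streetwear",
    low_level="short sleeves, round neckline",
)
for t, stage, prompt in plan:
    print(f"t={t:2d}  {stage:9s}  {prompt}")
```

In a real sampler, each tuple would select the text embedding passed to the denoising network at that timestep; where the boundary between the stages falls is a design choice that this sketch leaves as a free parameter.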

Published

2025-04-11

How to Cite

Xie, Z., Li, H., Ding, H., Li, M., Di, X., & Cao, Y. (2025). HieraFashDiff: Hierarchical Fashion Design with Multi-stage Diffusion Models. Proceedings of the AAAI Conference on Artificial Intelligence, 39(8), 8762–8770. https://doi.org/10.1609/aaai.v39i8.32947

Section

AAAI Technical Track on Computer Vision VII