FashionMAC: Deformation-Free Fashion Image Generation with Fine-Grained Model Appearance Customization

Rong Zhang; Jinxiao Li; Jingnan Wang; Zhiwen Zuo; Jianfeng Dong; Wei Li; Chi Wang; Weiwei Xu; Xun Wang

doi:10.1609/aaai.v40i15.38266

Authors

Rong Zhang, Zhejiang Gongshang University
Jinxiao Li, Zhejiang Gongshang University
Jingnan Wang, Zhejiang Gongshang University
Zhiwen Zuo, Zhejiang Gongshang University
Jianfeng Dong, Zhejiang Gongshang University
Wei Li, Nanjing University
Chi Wang, Zhejiang University
Weiwei Xu, Zhejiang University
Xun Wang, Zhejiang Gongshang University

Abstract

Garment-centric fashion image generation aims to synthesize realistic and controllable human models wearing a given garment, and it has attracted growing interest due to its practical applications in e-commerce. The key challenges of the task are twofold: (1) faithfully preserving garment details, and (2) gaining fine-grained control over the model's appearance. Existing methods typically deform the garment during generation, which often distorts garment textures. They also fail to control fine-grained attributes of the generated models because they lack mechanisms designed for that purpose. To address these issues, we propose FashionMAC, a novel diffusion-based, deformation-free framework that achieves high-quality and controllable fashion showcase image generation. The core idea is to eliminate garment deformation and directly outpaint the garment segmented from a dressed person, which enables faithful preservation of intricate garment details. Moreover, we propose a novel region-adaptive decoupled attention (RADA) mechanism along with a chained mask injection strategy to achieve fine-grained appearance controllability over the synthesized human models. Specifically, RADA adaptively predicts the generated region for each fine-grained text attribute and, through the chained mask injection strategy, enforces each text attribute to focus on its predicted region, significantly enhancing visual fidelity and controllability. Extensive experiments validate the superior performance of our framework compared to existing state-of-the-art methods.
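
The abstract gives no implementation details for RADA, so the following is only a rough illustration of the general idea of restricting each text attribute's cross-attention to its own image region. It is a generic region-masked attention sketch in NumPy, not the authors' method: the function name `region_masked_attention`, the tensor shapes, and the hand-written masks are all assumptions made for illustration (in the paper the regions are predicted adaptively, and the chained mask injection strategy is not modeled here).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def region_masked_attention(queries, attr_keys, attr_values, region_masks):
    """Hypothetical sketch: one cross-attention branch per text attribute,
    each branch gated by a spatial mask over the image positions.

    queries:      (n, d) array, one row per image position
    attr_keys:    list of (t_k, d) arrays, one per attribute
    attr_values:  list of (t_k, d) arrays, one per attribute
    region_masks: list of (n,) arrays in [0, 1], one per attribute
    """
    n, d = queries.shape
    out = np.zeros((n, attr_values[0].shape[1]))
    for keys, values, mask in zip(attr_keys, attr_values, region_masks):
        # Standard scaled dot-product attention for this attribute alone.
        weights = softmax(queries @ keys.T / np.sqrt(d), axis=-1)
        # Only positions inside the attribute's region receive its output.
        out += mask[:, None] * (weights @ values)
    return out

# Toy example: 6 image positions, 2 attributes (say, hair and face).
rng = np.random.default_rng(0)
n, d = 6, 4
queries = rng.normal(size=(n, d))
attr_keys = [rng.normal(size=(3, d)), rng.normal(size=(2, d))]
attr_values = [rng.normal(size=(3, d)), rng.normal(size=(2, d))]
hair_mask = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0])  # assumed, not predicted
face_mask = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])  # assumed, not predicted
result = region_masked_attention(
    queries, attr_keys, attr_values, [hair_mask, face_mask]
)
```

In this toy setup, positions covered by neither mask receive no contribution from any attribute, and positions inside a single mask see only that attribute's tokens, which is the decoupling behavior the abstract describes at a high level.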
