FashionMAC: Deformation-Free Fashion Image Generation with Fine-Grained Model Appearance Customization
DOI:
https://doi.org/10.1609/aaai.v40i15.38266
Abstract
Garment-centric fashion image generation aims to synthesize realistic and controllable human models wearing a given garment, which has attracted growing interest due to its practical applications in e-commerce. The key challenges of the task lie in two aspects: (1) faithfully preserving the garment details, and (2) gaining fine-grained controllability over the model's appearance. Existing methods typically require performing garment deformation in the generation process, which often leads to garment texture distortions. They also fail to control the fine-grained attributes of the generated models, due to the lack of specifically designed mechanisms. To address these issues, we propose FashionMAC, a novel diffusion-based deformation-free framework that achieves high-quality and controllable fashion showcase image generation. The core idea of our framework is to eliminate the need for performing garment deformation and directly outpaint the garment segmented from a dressed person, which enables faithful preservation of the intricate garment details. Moreover, we propose a novel region-adaptive decoupled attention (RADA) mechanism along with a chained mask injection strategy to achieve fine-grained appearance controllability over the synthesized human models. Specifically, RADA adaptively predicts the generated regions for each fine-grained text attribute and enforces the text attribute to focus on the predicted regions by a chained mask injection strategy, significantly enhancing the visual fidelity and the controllability. Extensive experiments validate the superior performance of our framework compared to existing state-of-the-art methods.
Published
2026-03-14
How to Cite
Zhang, R., Li, J., Wang, J., Zuo, Z., Dong, J., Li, W., … Wang, X. (2026). FashionMAC: Deformation-Free Fashion Image Generation with Fine-Grained Model Appearance Customization. Proceedings of the AAAI Conference on Artificial Intelligence, 40(15), 12699–12707. https://doi.org/10.1609/aaai.v40i15.38266
Issue
Vol. 40 No. 15 (2026)
Section
AAAI Technical Track on Computer Vision XII
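
The abstract describes RADA as restricting each fine-grained text attribute to a predicted image region. The following is a minimal sketch of that general idea only, not the authors' implementation: it assumes the per-attribute region masks are already given (the paper predicts them adaptively), uses plain scaled dot-product attention with one separate branch per attribute, and omits the diffusion model and the chained mask injection entirely. The function name `region_masked_attention` and the "hair" and "shoe" attributes are hypothetical, chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def region_masked_attention(queries, attr_keys, attr_values, attr_masks):
    """Sum of per-attribute cross-attention branches, each gated by a region mask.

    queries:        (N, d) image-token features
    attr_keys[i]:   (T_i, d) text-token keys for attribute i
    attr_values[i]: (T_i, d) text-token values for attribute i
    attr_masks[i]:  (N,) mask in [0, 1]; 1 where attribute i may act
                    (assumed given here, not predicted)
    """
    n, d = queries.shape
    out = np.zeros((n, d))
    for keys, values, mask in zip(attr_keys, attr_values, attr_masks):
        # One attention branch per attribute ("decoupled" in this sketch).
        weights = softmax(queries @ keys.T / np.sqrt(d), axis=-1)  # (N, T_i)
        # The mask zeroes the branch output outside the attribute's region.
        out += mask[:, None] * (weights @ values)
    return out


# Toy example: 6 image tokens, 2 hypothetical attributes.
rng = np.random.default_rng(0)
n, d = 6, 4
queries = rng.normal(size=(n, d))
hair_k, hair_v = rng.normal(size=(3, d)), rng.normal(size=(3, d))
shoe_k, shoe_v = rng.normal(size=(2, d)), rng.normal(size=(2, d))
hair_mask = np.array([1, 1, 0, 0, 0, 0], dtype=float)  # top tokens
shoe_mask = np.array([0, 0, 0, 0, 1, 1], dtype=float)  # bottom tokens

result = region_masked_attention(
    queries, [hair_k, shoe_k], [hair_v, shoe_v], [hair_mask, shoe_mask]
)
```

In this toy setup, tokens covered by neither mask receive no text conditioning at all, while each masked token is influenced only by the attribute assigned to its region.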