Towards Squeezing-Averse Virtual Try-On via Sequential Deformation

Authors

  • Sang-Heon Shim Sungkyunkwan University
  • Jiwoo Chung Sungkyunkwan University
  • Jae-Pil Heo Sungkyunkwan University

DOI:

https://doi.org/10.1609/aaai.v38i5.28288

Keywords:

CV: Computational Photography, Image & Video Synthesis, CV: Applications

Abstract

In this paper, we first investigate a visual quality degradation problem observed in recent high-resolution virtual try-on approaches. We empirically find that the textures of clothes tend to be squeezed at the sleeve, as visualized in the upper row of Fig.1(a). A main cause of the issue is a gradient conflict between two popular losses, the Total Variation (TV) loss and the adversarial loss. Specifically, the TV loss aims to disconnect the boundaries between the sleeve and torso in a warped clothing mask, whereas the adversarial loss aims to connect them. These contrary objectives feed misaligned gradients back to the cascaded appearance flow estimation, resulting in undesirable squeezing artifacts. To reduce this, we propose Sequential Deformation (SD-VITON), which disentangles the appearance flow prediction layers into TV objective-dominant (TVOB) layers and a task-coexistence (TACO) layer. Specifically, we coarsely fit the clothes onto the human body via the TVOB layers, and then keep refining the fit via the TACO layer. In addition, the bottom row of Fig.1(a) shows a different type of squeezing artifact around the waist. To address it, we further propose to first warp the clothes into a tucked-out shirt style, and then partially erase the texture from the warped clothes without hurting the smoothness of the appearance flows. Experimental results show that our SD-VITON successfully resolves both types of artifacts and outperforms the baseline methods. Source code will be available at https://github.com/SHShim0513/SD-VITON.
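
The two notions the abstract relies on, a TV penalty on an appearance flow and a conflict between the gradients of two losses, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `total_variation` and `cosine_similarity`, the anisotropic L1 form of the TV term, the toy flow grids, and the two gradient vectors are all assumptions chosen for illustration. A negative cosine similarity between two gradients is one common way to say that they conflict.

```python
import math


def total_variation(flow):
    """Anisotropic TV of a 2-D grid of (dx, dy) flow vectors (assumed form).

    Sums the absolute differences between each vector and its right and
    bottom neighbours, so a spatially constant flow scores zero.
    """
    h, w = len(flow), len(flow[0])
    tv = 0.0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:  # horizontal neighbour
                tv += sum(abs(a - b) for a, b in zip(flow[y][x + 1], flow[y][x]))
            if y + 1 < h:  # vertical neighbour
                tv += sum(abs(a - b) for a, b in zip(flow[y + 1][x], flow[y][x]))
    return tv


def cosine_similarity(g1, g2):
    """Cosine of the angle between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(g1, g2))
    n1 = math.sqrt(sum(a * a for a in g1))
    n2 = math.sqrt(sum(b * b for b in g2))
    if n1 == 0.0 or n2 == 0.0:
        return 0.0  # treat a zero gradient as "no conflict"
    return dot / (n1 * n2)


# Hypothetical 3x3 flows: one constant, one with an abrupt jump in the last column.
smooth_flow = [[(1.0, 0.0)] * 3 for _ in range(3)]
jump_flow = [[(1.0, 0.0), (1.0, 0.0), (-1.0, 0.0)] for _ in range(3)]

tv_smooth = total_variation(smooth_flow)
tv_jump = total_variation(jump_flow)

# Hypothetical gradients of two losses with respect to the same parameters.
grad_tv = [1.0, -2.0]
grad_adv = [-1.0, 2.0]
conflict = cosine_similarity(grad_tv, grad_adv)  # negative: opposing directions
```

In this toy setting the constant flow incurs no TV penalty while the flow with a jump does, and the two hand-picked gradients point in opposite directions. How the paper measures the conflict in practice is not stated in the abstract.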

Published

2024-03-24

How to Cite

Shim, S.-H., Chung, J., & Heo, J.-P. (2024). Towards Squeezing-Averse Virtual Try-On via Sequential Deformation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(5), 4856-4863. https://doi.org/10.1609/aaai.v38i5.28288

Issue

Section

AAAI Technical Track on Computer Vision IV