Cross-Domain Human Parsing via Adversarial Feature and Label Adaptation

Si Liu; Yao Sun; Defa Zhu; Guanghui Ren; Yu Chen; Jiashi Feng; Jizhong Han

doi:10.1609/aaai.v32i1.12320

Authors

Si Liu Institute of Information Engineering, Chinese Academy of Sciences
Yao Sun Institute of Information Engineering, Chinese Academy of Sciences
Defa Zhu Institute of Information Engineering, Chinese Academy of Sciences
Guanghui Ren Institute of Information Engineering, Chinese Academy of Sciences
Yu Chen JD.com
Jiashi Feng National University of Singapore
Jizhong Han Institute of Information Engineering, Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v32i1.12320

Keywords:

semantic segmentation, cross domain

Abstract

Human parsing has been studied extensively in recent years owing to its wide applications in many important scenarios. Mainstream fashion parsing models (i.e., parsers) focus on parsing high-resolution, clean images. However, directly applying parsers trained on benchmarks of high-quality samples to a particular application scenario in the wild, e.g., a canteen, airport or workplace, often gives unsatisfactory performance due to domain shift. In this paper, we explore a new and challenging cross-domain human parsing problem: taking a benchmark dataset with extensive pixel-wise labeling as the source domain, how can we obtain a satisfactory parser on a new target domain without requiring any additional manual labeling? To this end, we propose a novel and efficient cross-domain human parsing model that bridges the cross-domain differences in visual appearance and environment conditions and fully exploits commonalities across domains. Our proposed model explicitly learns a feature compensation network, which is specialized for mitigating the cross-domain differences. A discriminative feature adversarial network is introduced to supervise the feature compensation so that it effectively reduces the discrepancy between the feature distributions of the two domains. In addition, the model introduces a structured label adversarial network to guide the parsing results of the target domain to follow the high-order relationships of the structured labels shared across domains. The proposed framework is end-to-end trainable, practical and scalable in real applications. Extensive experiments are conducted with the LIP dataset as the source domain and 4 different datasets, including surveillance videos, movies and runway shows, all without any annotations, as target domains. The results consistently confirm the data efficiency and performance advantages of the proposed method for the challenging cross-domain human parsing problem.
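
The abstract names two adversarial components, one acting on features and one acting on structured labels, but does not give their loss functions. The following is a minimal sketch of how such objectives are commonly combined in GAN-style domain adaptation, not the authors' formulation. The binary cross-entropy form, the convention that the discriminators output the probability of "source", and the weights `w_feat` and `w_label` are all assumptions made for illustration.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for a single discriminator output in (0, 1)."""
    eps = 1e-12  # clamp to avoid log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_source, d_target):
    """Assumed convention: the discriminator should output 1 for source
    inputs and 0 for target inputs."""
    return bce(d_source, 1.0) + bce(d_target, 0.0)


def adaptation_loss(d_target):
    """The adapted target branch tries to fool the discriminator, i.e. to
    make target inputs look like source inputs (output close to 1)."""
    return bce(d_target, 1.0)


def total_parser_loss(seg_loss, d_feat_target, d_label_target,
                      w_feat=1.0, w_label=1.0):
    """Hypothetical combined objective for the parser: supervised
    segmentation loss on the labeled source domain plus the two adversarial
    terms (feature-level and label-level) on the unlabeled target domain."""
    return (seg_loss
            + w_feat * adaptation_loss(d_feat_target)
            + w_label * adaptation_loss(d_label_target))


# Example: both discriminators are undecided (0.5) about a target sample.
loss = total_parser_loss(seg_loss=1.0, d_feat_target=0.5, d_label_target=0.5)
```

In this sketch the adversarial terms shrink as the discriminators become less able to tell target samples from source samples, which matches the abstract's stated goal of reducing the discrepancy between the two domains.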
