Residual Encoder Decoder Network and Adaptive Prior for Face Parsing

Authors

  • Tianchu Guo Beijing Samsung Telecommunication
  • Youngsung Kim Samsung Advanced Institute of Technology
  • Hui Zhang Beijing Samsung Telecommunication
  • Deheng Qian Beijing Samsung Telecommunication
  • ByungIn Yoo Samsung Advanced Institute of Technology
  • Jingtao Xu Beijing Samsung Telecommunication
  • Dongqing Zou Beijing Samsung Telecommunication
  • Jae-Joon Han Samsung Advanced Institute of Technology
  • Changkyu Choi Samsung Advanced Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v32i1.12268

Keywords:

face parsing, encoder decoder, residual network, adaptive prior

Abstract

Face parsing assigns every pixel in a facial image a semantic label, which can be applied in various applications including face recognition, facial beautification, affective computing, and animation. While much progress has been made in this field, current state-of-the-art methods still fail to extract truly effective features and restore accurate score maps, especially for facial parts with large deformation variations and fairly similar appearance, e.g., the mouth, eyes, and thin eyebrows. In this paper, we propose a novel pixel-wise face parsing method called Residual Encoder Decoder Network (RED-Net), which combines a feature-rich encoder-decoder framework with an adaptive prior mechanism. Our encoder-decoder framework extracts features with ResNet and decodes them by elaborately fusing residual architectures into the deconvolution. This framework learns more effective features than those learned by decoding with interpolation or classic deconvolution operations. To overcome the appearance ambiguity between facial parts, an adaptive prior mechanism is proposed in terms of the decoder prediction confidence, allowing the final result to be refined. Experimental results on two public datasets demonstrate that our method significantly outperforms the state of the art, improving F-measure from 0.854 to 0.905 on the Helen dataset and pixel accuracy from 95.12% to 97.59% on the LFW dataset. In particular, convincing qualitative examples show that our method parses eye, eyebrow, and lip regions more accurately.

Published

2018-04-27

How to Cite

Guo, T., Kim, Y., Zhang, H., Qian, D., Yoo, B., Xu, J., Zou, D., Han, J.-J., & Choi, C. (2018). Residual Encoder Decoder Network and Adaptive Prior for Face Parsing. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.12268