Towards Omni-Supervised Face Alignment for Large Scale Unlabeled Videos

Congcong Zhu; Hao Liu*(corresponding author); Zhenhua Yu; Xuehong Sun

doi:10.1609/aaai.v34i07.7011

Authors

Congcong Zhu, Shanghai University
Hao Liu* (corresponding author), Ningxia University
Zhenhua Yu, Ningxia University
Xuehong Sun, Ningxia University

DOI:

https://doi.org/10.1609/aaai.v34i07.7011

Abstract

In this paper, we propose a spatial-temporal relational reasoning network (STRRN) approach to investigate the problem of omni-supervised face alignment in videos. Unlike existing fully supervised methods, which rely on numerous hand annotations, our learner exploits large-scale unlabeled videos together with available labeled data to generate plausible auxiliary training annotations. Motivated by the fact that neighbouring facial landmarks are usually correlated and coherent across consecutive frames, our approach automatically reasons about discriminative spatial-temporal relationships among landmarks for stable face tracking. Specifically, we develop an interpretable and efficient network module that disentangles the facial geometry relationships in every static frame and simultaneously enforces bi-directional cycle-consistency across adjacent frames, thus allowing intrinsic spatial-temporal relations to be modeled from raw face sequences. Extensive experimental results demonstrate that our approach surpasses the performance of most fully supervised state-of-the-art methods.
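The cycle-consistency idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the STRRN module itself: the function name `cycle_consistency_error` and the toy translation trackers are hypothetical stand-ins for a learned model, chosen only to show how a forward-then-backward round trip between adjacent frames yields an error signal that needs no manual labels.

```python
import numpy as np

def cycle_consistency_error(landmarks, track_forward, track_backward):
    """Mean Euclidean distance between landmarks and their round-trip estimate.

    landmarks: (N, 2) array of (x, y) points in the starting frame.
    track_forward / track_backward: callables mapping an (N, 2) array to the
    adjacent frame and back (hypothetical stand-ins for a learned tracker).
    """
    landmarks_next = track_forward(landmarks)        # frame t -> frame t+1
    landmarks_back = track_backward(landmarks_next)  # frame t+1 -> frame t
    return float(np.mean(np.linalg.norm(landmarks_back - landmarks, axis=1)))

# Toy trackers (assumption): the face moves by a pure translation between frames.
shift = np.array([2.0, -1.0])

def forward(pts):
    return pts + shift

def backward_exact(pts):
    # Perfectly inverts the forward motion, so the round trip is lossless.
    return pts - shift

def backward_drift(pts):
    # Inverts the motion but adds a constant drift of (3, 4) pixels.
    return pts - shift + np.array([3.0, 4.0])

pts = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])
err_exact = cycle_consistency_error(pts, forward, backward_exact)
err_drift = cycle_consistency_error(pts, forward, backward_drift)
```

In this sketch a consistent tracker pair gives zero round-trip error, while a drifting one is penalised in proportion to the drift. A bi-directional variant would presumably also evaluate the reverse round trip (frame t+1 to t and back) by calling the same function with the two trackers swapped.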
