PrefAce: Face-Centric Pretraining with Self-Structure Aware Distillation

Siyuan Hu; Zheng Wang; Peng Hu; Xi Peng; Jie Wu; Hongyuan Zhu; Yew Soon Ong

doi:10.1609/aaai.v38i11.29147

Authors

Siyuan Hu, Nanyang Technological University
Zheng Wang, Wuhan University
Peng Hu, Sichuan University
Xi Peng, Sichuan University
Jie Wu, Wuhan University
Hongyuan Zhu, Institute for Infocomm Research (I2R) & Centre for Frontier AI Research (CFAR), A*STAR, Singapore
Yew Soon Ong, Nanyang Technological University; Institute for Infocomm Research (I2R) & Centre for Frontier AI Research (CFAR), A*STAR, Singapore

DOI:

https://doi.org/10.1609/aaai.v38i11.29147

Keywords:

ML: Multimodal Learning, ML: Representation Learning

Abstract

Video-based facial analysis is important for autonomous agents to understand human expressions and sentiments. However, only limited labeled data is available for learning effective facial representations. This paper proposes a novel self-supervised face-centric pretraining framework, called PrefAce, which learns transferable video facial representations without labels. The self-supervised learning is performed with an effective landmark-guided global-local tube distillation. Meanwhile, a novel FaceFeat Cache with instance-wise updates is built to enforce more discriminative and diverse representations for downstream tasks. Extensive experiments demonstrate that the proposed framework learns universal instance-aware facial representations with fine-grained landmark details from videos. These representations transfer across various facial analysis tasks, e.g., Facial Attribute Recognition (FAR), Facial Expression Recognition (FER), DeepFake Detection (DFD), and Lip Synchronization (LS). Our framework also outperforms the state-of-the-art on various downstream tasks, even in low data regimes. Code is available at https://github.com/siyuan-h/PrefAce.
