RGBD Based Gaze Estimation via Multi-Task CNN

Dongze Lian; Ziheng Zhang; Weixin Luo; Lina Hu; Minye Wu; Zechao Li; Jingyi Yu; Shenghua Gao

doi:10.1609/aaai.v33i01.33012488

Authors

Dongze Lian, ShanghaiTech University
Ziheng Zhang, ShanghaiTech University
Weixin Luo, ShanghaiTech University
Lina Hu, ShanghaiTech University
Minye Wu, ShanghaiTech University
Zechao Li, Nanjing University of Science and Technology
Jingyi Yu, ShanghaiTech University
Shenghua Gao, ShanghaiTech University


Abstract

This paper tackles RGBD-based gaze estimation with Convolutional Neural Networks (CNNs). Specifically, we propose to decompose gaze point estimation into eyeball pose, head pose, and 3D eye position estimation. Compared with RGB image-based gaze tracking, the depth modality facilitates head pose estimation and 3D eye position estimation. The captured depth image, however, usually contains noise and black holes, which noticeably hamper gaze tracking. We therefore propose a CNN-based multi-task learning framework that simultaneously refines depth images and predicts gaze points. We use a generator network for depth image generation within a Generative Adversarial Network (GAN), where the generator is partially shared by the gaze tracking network and the GAN-based depth synthesis. Because the whole network is optimized jointly, depth image synthesis improves gaze point estimation and vice versa. Since the only existing RGBD dataset (EYEDIAP) is too small, we build a large-scale RGBD gaze tracking dataset for performance evaluation. As far as we know, it is the largest RGBD gaze dataset in terms of the number of participants. Comprehensive experiments demonstrate that our method outperforms existing methods by a large margin on both our dataset and the EYEDIAP dataset.
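
The decomposition named in the abstract (eyeball pose, head pose, 3D eye position) can be illustrated with a purely geometric sketch. This is not the authors' network or code; it only shows, under stated assumptions, how the three quantities could combine into a gaze point once each has been estimated. The assumptions are: the screen lies in the plane z = 0 of the camera coordinate system, the head pose is given as a 3x3 rotation matrix, and the eyeball pose is a gaze direction expressed in head coordinates. The function names `rotate` and `gaze_point_on_plane` are hypothetical.

```python
def rotate(rotation, vector):
    """Apply a 3x3 rotation matrix (list of rows) to a 3-vector."""
    return [sum(rotation[i][j] * vector[j] for j in range(3)) for i in range(3)]


def gaze_point_on_plane(eye_pos, head_rot, gaze_dir_head):
    """Intersect the gaze ray with the plane z = 0 (assumed screen plane).

    eye_pos:       3D eye position in camera coordinates
    head_rot:      3x3 head rotation (head -> camera coordinates)
    gaze_dir_head: gaze direction in head coordinates (eyeball pose)
    """
    # Head pose turns the eyeball direction into a camera-space direction.
    direction = rotate(head_rot, gaze_dir_head)
    if abs(direction[2]) < 1e-9:
        raise ValueError("gaze ray is parallel to the screen plane")
    # Solve eye_pos.z + t * direction.z = 0 for the ray parameter t.
    t = -eye_pos[2] / direction[2]
    if t < 0:
        raise ValueError("screen plane is behind the eye")
    return (eye_pos[0] + t * direction[0], eye_pos[1] + t * direction[1])


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Eye 500 units in front of the screen, looking slightly to one side.
point = gaze_point_on_plane([0.0, 0.0, 500.0], IDENTITY, [0.1, 0.0, -1.0])
print(point)
```

The sketch makes the role of depth visible: the eye's distance from the screen (the z component of `eye_pos`) scales the on-screen displacement directly, so errors in the 3D eye position propagate into the gaze point.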
