A Coarse-to-Fine Adaptive Network for Appearance-Based Gaze Estimation


  • Yihua Cheng, Beihang University
  • Shiyao Huang, SenseTime Co., Ltd.
  • Fei Wang, SenseTime Co., Ltd.
  • Chen Qian, SenseTime Co., Ltd.
  • Feng Lu, Beihang University




Human gaze is essential for various appealing applications. Aiming at more accurate gaze estimation, a series of recent works propose to utilize face and eye images simultaneously. Nevertheless, face and eye images serve only as independent or parallel feature sources in those works, and the intrinsic correlation between their features is overlooked. In this paper, we make the following contributions: 1) We propose a coarse-to-fine strategy that estimates a basic gaze direction from the face image and refines it with a corresponding residual predicted from the eye images. 2) Guided by the proposed strategy, we design a framework that introduces a bi-gram model to bridge the gaze residual and the basic gaze direction, and an attention component to adaptively acquire suitable fine-grained features. 3) Integrating the above innovations, we construct a coarse-to-fine adaptive network named CA-Net and achieve state-of-the-art performance on MPIIGaze and EyeDiap.
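
The coarse-to-fine idea in contribution 1 can be illustrated with a minimal numeric sketch. It is not the CA-Net implementation: it assumes that gaze is represented as (yaw, pitch) angles in radians, that the eye-derived residual is simply added to the face-derived basic direction, and that error is measured as the angle between 3D gaze vectors. The function names, the angle-to-vector convention, and the numbers are all hypothetical and chosen only for illustration.

```python
import math


def refine_gaze(basic, residual):
    """Add an eye-derived residual to a face-derived basic gaze.

    Both arguments are (yaw, pitch) tuples in radians. Additive
    refinement is an assumption made for this sketch.
    """
    return (basic[0] + residual[0], basic[1] + residual[1])


def to_vector(yaw, pitch):
    """Convert (yaw, pitch) to a 3D unit vector (one common convention, assumed here)."""
    return (
        -math.cos(pitch) * math.sin(yaw),
        -math.sin(pitch),
        -math.cos(pitch) * math.cos(yaw),
    )


def angular_error_deg(a, b):
    """Angle in degrees between two gaze directions given as (yaw, pitch)."""
    va, vb = to_vector(*a), to_vector(*b)
    dot = sum(x * y for x, y in zip(va, vb))
    dot = max(-1.0, min(1.0, dot))  # guard acos against rounding error
    return math.degrees(math.acos(dot))


# Hypothetical values: a coarse estimate from the face, a residual from the eyes.
truth = (0.15, 0.10)
basic = (0.10, 0.20)
residual = (0.05, -0.10)

refined = refine_gaze(basic, residual)
coarse_error = angular_error_deg(basic, truth)
refined_error = angular_error_deg(refined, truth)
```

In this toy case the residual was picked to cancel the coarse error exactly, so `refined_error` is close to zero while `coarse_error` is not. In the paper the residual is predicted by a network from the eye images, so the refinement would only be approximate.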




How to Cite

Cheng, Y., Huang, S., Wang, F., Qian, C., & Lu, F. (2020). A Coarse-to-Fine Adaptive Network for Appearance-Based Gaze Estimation. Proceedings of the AAAI Conference on Artificial Intelligence, 34(07), 10623-10630. https://doi.org/10.1609/aaai.v34i07.6636



AAAI Technical Track: Vision