Learning Invariant Deep Representation for NIR-VIS Face Recognition

Authors

  • Ran He Institute of Automation, Chinese Academy of Sciences
  • Xiang Wu Institute of Automation, Chinese Academy of Sciences
  • Zhenan Sun Institute of Automation, Chinese Academy of Sciences
  • Tieniu Tan Institute of Automation, Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v31i1.10786

Keywords:

deep learning, face recognition, heterogeneous, near infrared, CNN

Abstract

Visual versus near infrared (VIS-NIR) face recognition is still a challenging heterogeneous task due to the large appearance difference between the VIS and NIR modalities. This paper presents a deep convolutional network approach that uses only one network to map both NIR and VIS images to a compact Euclidean space. The low-level layers of this network are trained only on large-scale VIS data. Each convolutional layer is implemented with the simplest case of the maxout operator. The high-level layer is divided into two orthogonal subspaces that contain modality-invariant identity information and modality-variant spectrum information, respectively. Our joint formulation leads to an alternating minimization approach for deep representation at training time and an efficient computation for heterogeneous data at test time. Experimental evaluations show that our method achieves a 94% verification rate at FAR=0.1% on the challenging CASIA NIR-VIS 2.0 face recognition dataset. Compared with state-of-the-art methods, it reduces the error rate by 58% with only a compact 64-D representation.
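The two architectural ideas named in the abstract — a maxout unit in its simplest (two-piece) case, and a high-level layer split into two mutually orthogonal subspaces for identity and spectrum information — can be sketched as follows. This is a hedged NumPy illustration under assumed shapes and variable names (`maxout_pair`, `W_id`, `W_sp` are hypothetical), not the authors' implementation:

```python
import numpy as np

def maxout_pair(x, w1, b1, w2, b2):
    """Simplest maxout unit: element-wise max over two linear feature maps."""
    return np.maximum(x @ w1 + b1, x @ w2 + b2)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                        # toy batch: 4 inputs, 8-D
w1, w2 = rng.normal(size=(8, 6)), rng.normal(size=(8, 6))
b1, b2 = np.zeros(6), np.zeros(6)
h = maxout_pair(x, w1, b1, w2, b2)                 # (4, 6) hidden features

# Orthogonal subspace split of the high-level feature: disjoint column
# blocks of one orthonormal basis give W_id^T W_sp = 0 by construction.
Q, _ = np.linalg.qr(rng.normal(size=(6, 6)))
W_id, W_sp = Q[:, :4], Q[:, 4:]                    # identity vs. spectrum parts
identity_feat = h @ W_id                           # modality-invariant, used for matching
spectrum_feat = h @ W_sp                           # modality-variant residual
assert np.allclose(W_id.T @ W_sp, 0.0)             # subspaces are orthogonal
```

Only the `identity_feat` projection would be compared across NIR and VIS images at test time, which is what makes the matching computation cheap for heterogeneous data.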

Published

2017-02-13

How to Cite

He, R., Wu, X., Sun, Z., & Tan, T. (2017). Learning Invariant Deep Representation for NIR-VIS Face Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1). https://doi.org/10.1609/aaai.v31i1.10786