mmFAS: Multimodal Face Anti-Spoofing Using Multi-Level Alignment and Switch-Attention Fusion

Authors

  • Geng Chen, College of CSSE, Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University; State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications
  • Wuyuan Xie, College of CSSE, Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University
  • Di Lin, College of Intelligence and Computing, Tianjin University
  • Ye Liu, School of Automation, Nanjing University of Posts and Telecommunications
  • Miaohui Wang, College of CSSE, Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University

DOI:

https://doi.org/10.1609/aaai.v39i1.31980

Abstract

The increasing number of presentation attacks on reliable face matching has raised concerns and garnered attention towards face anti-spoofing (FAS). However, existing methods for FAS modeling commonly fuse multiple visual modalities (e.g., RGB, Depth, and Infrared) in a straightforward manner, disregarding latent feature gaps that can hinder representation learning. To address this challenge, we propose a novel multimodal FAS framework (mmFAS) that focuses on explicit alignment and fusion of latent features across different modalities. Specifically, we develop a multimodal alignment module to alleviate the latent feature gap by using instance-level contrastive learning and class-level matching simultaneously. Further, we explore a new switch-attention based fusion module to automatically aggregate complementary information and control model complexity. To evaluate the anti-spoofing performance more effectively, we adopt a challenging yet meaningful cross-database protocol involving four benchmark multimodal FAS datasets to simulate real-world scenarios. Extensive experimental results demonstrate the effectiveness of mmFAS in improving the accuracy of FAS systems, outperforming 10 representative methods.
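
As an illustration of what instance-level contrastive alignment between two modalities can look like, the sketch below computes a generic symmetric InfoNCE-style loss over paired RGB and Depth feature vectors. It is not taken from the paper: the function names, the temperature value, and the toy feature vectors are all assumptions made for this example, and the class-level matching and switch-attention fusion components are not modeled.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length; fall back to 1.0 for an all-zero vector.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def info_nce(anchor_feats, other_feats, temperature=0.1):
    # Generic InfoNCE: each anchor should be most similar to the
    # same-index feature from the other modality (its positive pair).
    a = [l2_normalize(v) for v in anchor_feats]
    b = [l2_normalize(v) for v in other_feats]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = [sum(x * y for x, y in zip(ai, bj)) / temperature for bj in b]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)

def symmetric_alignment_loss(rgb_feats, depth_feats, temperature=0.1):
    # Average the loss in both directions (RGB to Depth and Depth to RGB).
    return 0.5 * (info_nce(rgb_feats, depth_feats, temperature)
                  + info_nce(depth_feats, rgb_feats, temperature))

# Toy features (hypothetical): two samples, two dimensions per modality.
rgb = [[1.0, 0.0], [0.0, 1.0]]
depth_aligned = [[0.9, 0.1], [0.1, 0.9]]   # each sample close to its RGB partner
depth_swapped = [[0.1, 0.9], [0.9, 0.1]]   # partners mismatched

aligned_loss = symmetric_alignment_loss(rgb, depth_aligned)
swapped_loss = symmetric_alignment_loss(rgb, depth_swapped)
print(aligned_loss < swapped_loss)
```

In this toy setup the loss is lower when each sample's features agree across modalities, which is the behavior an alignment objective of this kind is meant to encourage.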

Published

2025-04-11

How to Cite

Chen, G., Xie, W., Lin, D., Liu, Y., & Wang, M. (2025). mmFAS: Multimodal Face Anti-Spoofing Using Multi-Level Alignment and Switch-Attention Fusion. Proceedings of the AAAI Conference on Artificial Intelligence, 39(1), 58-66. https://doi.org/10.1609/aaai.v39i1.31980

Section

AAAI Technical Track on Application Domains