FaceCoresetNet: Differentiable Coresets for Face Set Recognition

Authors

  • Gil Shapira, Samsung; Bar-Ilan University
  • Yosi Keller, Bar-Ilan University

DOI:

https://doi.org/10.1609/aaai.v38i5.28276

Keywords:

CV: Biometrics, Face, Gesture & Pose, CV: Learning & Optimization for CV, CV: Object Detection & Categorization, ML: Classification and Regression, ML: Deep Learning Algorithms, ML: Dimensionality Reduction/Feature Selection

Abstract

In set-based face recognition, we aim to compute the most discriminative descriptor from an unbounded set of images and videos showing a single person. A discriminative descriptor balances two policies when aggregating information from a given set. The first is a quality-based policy: emphasizing high-quality images and down-weighting low-quality ones. The second is a diversity-based policy: emphasizing unique images in the set and down-weighting multiple occurrences of similar images, as found in video clips, which can overwhelm the set representation. This work frames face-set representation as a differentiable coreset selection problem. Our model learns to select a small coreset of the input set that balances the quality and diversity policies using a learned metric parameterized by face quality, optimized end-to-end. The selection process is a differentiable farthest-point sampling (FPS), realized by approximating the non-differentiable argmax operation with differentiable sampling from the Gumbel-Softmax distribution of distances. The small coreset is later used as queries in a self- and cross-attention architecture to enrich the descriptor with information from the whole set. Our model is order-invariant and linear in the input set size. We set a new state of the art (SOTA) in set-based face verification on the IJB-B and IJB-C datasets. Our code is publicly available at https://github.com/ligaripash/FaceCoresetNet.
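
The selection step described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation (see the repository linked above for that); it only shows the general idea of replacing the argmax in farthest-point sampling with a Gumbel-Softmax sample over distances. The function names `gumbel_softmax` and `soft_fps`, the temperature `tau`, the choice to start from the highest-quality item, and the use of `distance * quality` as a stand-in for the learned quality-aware metric are all assumptions made for illustration. The sketch uses plain Python, so it computes no gradients; in an autodiff framework the same soft weights would let gradients flow through the selection.

```python
import math
import random


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def gumbel_softmax(logits, tau, rng):
    """Draw soft one-hot weights from the Gumbel-Softmax distribution."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / tau)
    peak = max(noisy)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def soft_fps(features, quality, k, tau=0.1, seed=0):
    """Pick k soft coreset points; each step costs O(n), so O(k * n) overall.

    ASSUMPTIONS (not taken from the paper): the first point is the
    highest-quality item, and `distance * quality` stands in for the
    learned quality-aware metric.
    """
    rng = random.Random(seed)
    n, dim = len(features), len(features[0])
    first = max(range(n), key=lambda i: quality[i])
    coreset = [list(features[first])]
    weights = [[1.0 if i == first else 0.0 for i in range(n)]]
    # Distance from every item to its nearest already-selected point.
    min_dist = [sqdist(f, coreset[0]) for f in features]
    for _ in range(k - 1):
        logits = [d * q for d, q in zip(min_dist, quality)]
        w = gumbel_softmax(logits, tau, rng)  # soft replacement for argmax
        # The selected point is a convex combination of the inputs.
        chosen = [sum(w[i] * features[i][j] for i in range(n)) for j in range(dim)]
        coreset.append(chosen)
        weights.append(w)
        min_dist = [min(d, sqdist(f, chosen)) for d, f in zip(min_dist, features)]
    return coreset, weights


# Toy set: two near-duplicate points and two distant ones.
features = [[0.0, 0.0], [0.1, 0.0], [10.0, 0.0], [0.0, 20.0]]
quality = [1.0, 0.9, 0.8, 0.7]
coreset, weights = soft_fps(features, quality, k=3)
picks = [max(range(len(w)), key=w.__getitem__) for w in weights]
print(picks)
```

In this toy set the near-duplicate of the first point receives almost no weight, which mirrors the diversity policy: repeated, similar items do not dominate the selection. A low temperature makes the weights close to one-hot, while a higher one spreads them across several items.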

Published

2024-03-24

How to Cite

Shapira, G., & Keller, Y. (2024). FaceCoresetNet: Differentiable Coresets for Face Set Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 38(5), 4748-4756. https://doi.org/10.1609/aaai.v38i5.28276

Section

AAAI Technical Track on Computer Vision IV