Int*-Match: Balancing Intra-Class Compactness and Inter-Class Discrepancy for Semi-Supervised Speaker Recognition

Authors

  • Xingmei Wang, College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  • Jinghan Liu, College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  • Jiaxiang Meng, College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  • Boquan Li, College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  • Zijian Liu, College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China

DOI:

https://doi.org/10.1609/aaai.v39i24.34729

Abstract

Open-set speaker recognition aims to determine whether given voices come from the same speaker. One challenge of speaker recognition is collecting large amounts of high-quality data. Given the promising results in image classification, one intuitively feasible solution is semi-supervised learning (SSL), which uses confidence thresholds to assign pseudo labels to unlabeled data. However, we empirically demonstrate that applying SSL methods to speaker recognition is non-trivial. These methods rely solely on inter-class discrepancy as the threshold for selecting pseudo labels and overlook intra-class compactness, which is particularly important for open-set speaker recognition. Motivated by this, we propose Int*-Match, a semi-supervised speaker recognition method that selects reliable pseudo labels using both intra-class compactness and inter-class discrepancy. In particular, we use the inter-class discrepancy of labeled data as the threshold for pseudo-label selection and adjust this threshold dynamically and adaptively based on the intra-class compactness of the pseudo labels. Our systematic experiments demonstrate the superiority of Int*-Match, which achieves an Equal Error Rate (EER) of 1.00% on the VoxCeleb1 original test set, only 0.06% behind the performance achieved by fully supervised learning.
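
The confidence-threshold pseudo-labeling that the abstract attributes to generic SSL methods can be sketched roughly as follows. This is a minimal illustration of that baseline idea only, not the Int*-Match selection rule, whose details are not given here. The function names, the fixed threshold of 0.95, and the toy logits are all hypothetical and chosen purely for illustration.

```python
import math


def softmax(logits):
    """Convert a list of raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(batch_logits, threshold=0.95):
    """Keep unlabeled samples whose top class probability reaches `threshold`.

    Returns a list of (sample_index, predicted_class, confidence) tuples.
    The fixed `threshold` is an assumption for this sketch; the abstract
    describes a threshold that is adjusted dynamically instead.
    """
    selected = []
    for index, logits in enumerate(batch_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold:
            selected.append((index, probs.index(confidence), confidence))
    return selected


# Toy classifier outputs for three unlabeled utterances over three speakers.
unlabeled_logits = [
    [8.0, 0.5, 0.2],  # clearly speaker 0
    [1.0, 1.1, 0.9],  # ambiguous, should be rejected
    [0.1, 0.3, 7.5],  # clearly speaker 2
]

pseudo = select_pseudo_labels(unlabeled_logits, threshold=0.95)
for index, label, confidence in pseudo:
    print(f"sample {index}: pseudo label {label} (confidence {confidence:.3f})")
```

In this toy batch only the first and third samples pass the threshold; the ambiguous second sample is left unlabeled. A rule of this kind looks only at how strongly one class dominates the others for a single sample, which corresponds to the purely inter-class criterion the abstract identifies as insufficient for open-set speaker recognition.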

Published

2025-04-11

How to Cite

Wang, X., Liu, J., Meng, J., Li, B., & Liu, Z. (2025). Int*-Match: Balancing Intra-Class Compactness and Inter-Class Discrepancy for Semi-Supervised Speaker Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 39(24), 25407–25415. https://doi.org/10.1609/aaai.v39i24.34729

Section

AAAI Technical Track on Natural Language Processing III