SimLabel: Similarity-Weighted Semi-supervision for Multi-annotator Learning with Missing Labels

Authors

  • Liyun Zhang, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
  • Zheng Lian, National Key Laboratory of Autonomous Intelligent Unmanned Systems, Tongji University, Shanghai, China
  • Hong Liu, School of Informatics, Xiamen University, Fujian, China
  • Takanori Takebe, Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
  • Yuta Nakashima, Institute of Scientific and Industrial Research, The University of Osaka, Osaka, Japan

DOI:

https://doi.org/10.1609/aaai.v40i33.40061

Abstract

Multi-annotator learning (MAL) aims to model annotator-specific labeling patterns. However, existing methods share a critical weakness: when a label is missing, they simply skip the update of the corresponding annotator-specific model parameters. Missing labels are common in real-world crowdsourced datasets, where each annotator labels only a small subset of the samples, so this practice uses data inefficiently and increases the risk of overfitting. To address this, we propose SimLabel, a novel similarity-weighted semi-supervised learning framework that leverages inter-annotator similarities to generate weighted soft labels for missing annotations, so that unannotated samples are used rather than skipped entirely. We further introduce a confidence-based iterative refinement mechanism that combines maximum probability with entropy-based uncertainty to select high-quality predicted pseudo-labels for imputing missing labels, jointly improving similarity estimation and model performance over time. For evaluation, we contribute a new multimodal multi-annotator dataset, AMER2, with high and more variable missing rates, reflecting real-world annotation sparsity and enabling evaluation across different sparsity levels. Extensive experiments validate the effectiveness of our method.
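
The two ideas named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: the function names, the use of -1 to mark a missing label, the normalization of similarities over annotators who did label the sample, and the product of maximum probability and normalized entropy as the confidence score. It should not be read as the authors' implementation.

```python
import numpy as np


def similarity_weighted_soft_labels(labels, sim, num_classes):
    """Fill missing labels for ONE sample with similarity-weighted soft labels.

    labels: (A,) integer class per annotator, -1 marks a missing label (assumed convention).
    sim:    (A, A) non-negative inter-annotator similarities (assumed to be given).
    Returns an (A, num_classes) array of label distributions.
    """
    labels = np.asarray(labels)
    sim = np.asarray(sim, dtype=float)
    observed = labels >= 0

    # One-hot rows for annotators who labeled the sample, zero rows otherwise.
    one_hot = np.zeros((labels.size, num_classes))
    one_hot[observed, labels[observed]] = 1.0

    # Only annotators with an observed label contribute to the weighted vote.
    weights = sim * observed[None, :]
    totals = weights.sum(axis=1, keepdims=True)

    # Fall back to a uniform distribution when nobody similar labeled the sample.
    uniform = np.full((labels.size, num_classes), 1.0 / num_classes)
    soft = np.where(totals > 0, weights @ one_hot / np.maximum(totals, 1e-12), uniform)

    # Annotators with a real label keep it unchanged.
    soft[observed] = one_hot[observed]
    return soft


def confidence(probs):
    """Hypothetical confidence score: max probability times (1 - normalized entropy)."""
    probs = np.asarray(probs, dtype=float)
    num_classes = probs.shape[-1]
    entropy = -np.sum(probs * np.log(np.clip(probs, 1e-12, 1.0)), axis=-1)
    certainty = 1.0 - entropy / np.log(num_classes)
    return probs.max(axis=-1) * certainty


# Three annotators, two classes; annotator 1 did not label this sample.
example_labels = [0, -1, 1]
example_sim = [[1.0, 0.8, 0.2],
               [0.8, 1.0, 0.5],
               [0.2, 0.5, 1.0]]
example_soft = similarity_weighted_soft_labels(example_labels, example_sim, num_classes=2)
```

In this sketch the missing annotator's soft label leans toward the class chosen by the annotator it is most similar to, and a score from `confidence` could be thresholded to decide which predicted pseudo-labels are trusted in a later refinement round.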

Published

2026-03-14

How to Cite

Zhang, L., Lian, Z., Liu, H., Takebe, T., & Nakashima, Y. (2026). SimLabel: Similarity-Weighted Semi-supervision for Multi-annotator Learning with Missing Labels. Proceedings of the AAAI Conference on Artificial Intelligence, 40(33), 28328–28336. https://doi.org/10.1609/aaai.v40i33.40061

Section

AAAI Technical Track on Machine Learning X