Generative Calibration of Inaccurate Annotation for Label Distribution Learning

Liang He; Yunan Lu; Weiwei Li; Xiuyi Jia

doi:10.1609/aaai.v38i11.29131

Authors

Liang He, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
Yunan Lu, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China; Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, China
Weiwei Li, College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Xiuyi Jia, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China; Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, China

DOI:

https://doi.org/10.1609/aaai.v38i11.29131

Keywords:

ML: Multi-class/Multi-label Learning & Extreme Classification

Abstract

Label distribution learning (LDL) is an effective learning paradigm for handling label ambiguity. Applying LDL typically requires datasets annotated with label distributions, but obtaining such supervised data is challenging. Because label annotation is subject to randomness, annotators can produce inaccurate annotations for an instance, which degrades the accuracy and generalization ability of the LDL model. To address this problem, we propose a generative approach that calibrates inaccurate annotations for LDL using variational inference. Specifically, we assume that instances with similar features share similar latent label distributions. The feature vectors and label distributions are generated by a Gaussian mixture and a Dirichlet mixture, respectively. The two are linked through a shared categorical variable, which makes effective use of the label distributions of instances with similar features and yields a more accurate label distribution. Furthermore, we use a confusion matrix to model the factors that contribute to inaccuracy during the annotation process, capturing the relationship between label distributions and inaccurate label distributions. Finally, the label distribution is used to calibrate the available information in the noisy dataset to obtain the ground-truth label distribution.
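
The generative assumptions described in the abstract can be illustrated with a minimal forward-sampling sketch. This is not the authors' implementation, and it omits the variational inference step entirely. The function name `sample_generative_process`, all parameter names, and in particular the way the confusion matrix is applied (a row-stochastic matrix multiplied onto the clean label distribution) are assumptions made for illustration only; the abstract does not specify these details.

```python
import numpy as np


def sample_generative_process(n, pi, means, covs, alphas, confusion, seed=0):
    """Forward-sample features and label distributions (illustrative sketch).

    pi        : (K,) mixing weights of the shared categorical variable
    means     : (K, F) Gaussian component means for the features
    covs      : (K, F, F) Gaussian component covariances
    alphas    : (K, L) Dirichlet concentration parameters per component
    confusion : (L, L) row-stochastic matrix (ASSUMED form of the noise model)
    """
    rng = np.random.default_rng(seed)

    # Shared categorical variable: one component index per instance.
    z = rng.choice(len(pi), size=n, p=pi)

    # Features come from the Gaussian mixture component selected by z.
    X = np.stack([rng.multivariate_normal(means[k], covs[k]) for k in z])

    # Clean label distributions come from the Dirichlet mixture component
    # selected by the same z, so instances with similar features tend to
    # share similar label distributions.
    D = np.stack([rng.dirichlet(alphas[k]) for k in z])

    # ASSUMPTION: the inaccurate annotation is the clean distribution pushed
    # through a row-stochastic confusion matrix. Rows of D_noisy still sum to 1.
    D_noisy = D @ confusion

    return z, X, D, D_noisy
```

Under these assumptions, calibration amounts to inverting this process: given the observed features and the noisy label distributions, infer the clean label distributions, which the paper does with variational inference.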
