Noise-Robust Learning from Multiple Unsupervised Sources of Inferred Labels

Amila Silva; Ling Luo; Shanika Karunasekera; Christopher Leckie

doi:10.1609/aaai.v36i8.20806

Authors

Amila Silva The University of Melbourne, Australia
Ling Luo The University of Melbourne, Australia
Shanika Karunasekera The University of Melbourne, Australia
Christopher Leckie The University of Melbourne, Australia

DOI:

https://doi.org/10.1609/aaai.v36i8.20806

Keywords:

Machine Learning (ML)

Abstract

Deep Neural Networks (DNNs) generally require large-scale datasets for training. Since manually obtaining clean labels for large datasets is extremely expensive, unsupervised models based on domain-specific heuristics can be used to efficiently infer labels for such datasets. However, labels from such inferred sources are typically noisy, which can easily mislead DNNs and reduce their generalizability. Most approaches proposed in the literature to address this problem assume that the label noise depends only on the true class of an instance (i.e., class-conditional noise). This assumption is not realistic for inferred labels, as they are typically inferred from the features of the instances. The few recent attempts to model such instance-dependent (i.e., feature-dependent) noise require auxiliary information about the label noise (e.g., noise rates or clean samples). This work proposes a theoretically motivated framework to correct label noise in the presence of multiple labels inferred from unsupervised models. The framework consists of two modules: (1) MULTI-IDNC, a novel approach to correct label noise that is instance-dependent yet not class-conditional; and (2) MULTI-CCNC, which extends an existing class-conditional noise-robust approach to yield improved class-conditional noise correction using multiple noisy label sources. We conduct experiments using nine real-world datasets for three different classification tasks (images, text, and graph nodes). Our results show that our approach achieves notable improvements (e.g., 6.4% in accuracy) over state-of-the-art baselines while dealing with both instance-dependent and class-conditional noise in inferred label sources.
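
The distinction the abstract draws between class-conditional and instance-dependent noise can be illustrated with a small simulation. The sketch below is not the paper's MULTI-IDNC or MULTI-CCNC method; it is a toy setup under assumed conditions (a single Gaussian feature, binary labels, a hypothetical threshold heuristic standing in for an unsupervised labeler, and an assumed flip probability of 0.2). It only shows that a feature-based heuristic produces an error rate that varies with the features, whereas class-conditional flipping does not.

```python
import random

def class_conditional_noise(y_true, flip_prob, rng):
    """Flip each binary label with a probability that depends only on its true class."""
    return [1 - y if rng.random() < flip_prob[y] else y for y in y_true]

def heuristic_labeler(x, threshold):
    """Hypothetical unsupervised heuristic: predict 1 when the feature exceeds a threshold."""
    return [1 if xi > threshold else 0 for xi in x]

def noise_rate(y_true, y_noisy, mask):
    """Fraction of wrong labels among the instances selected by mask."""
    idx = [i for i, selected in enumerate(mask) if selected]
    return sum(y_true[i] != y_noisy[i] for i in idx) / len(idx)

rng = random.Random(0)
n = 20000

# Toy data (assumption): class 0 is centred at -1, class 1 at +1, unit variance.
y_true = [rng.randint(0, 1) for _ in range(n)]
x = [rng.gauss(2 * y - 1, 1.0) for y in y_true]

# Class-conditional noise: the flip probability ignores the feature value.
y_cc = class_conditional_noise(y_true, {0: 0.2, 1: 0.2}, rng)

# Instance-dependent noise: labels inferred from the feature itself.
y_id = heuristic_labeler(x, threshold=0.0)

# Compare error rates close to and far from the heuristic's decision boundary.
near = [abs(xi) < 0.5 for xi in x]
far = [abs(xi) >= 1.5 for xi in x]

cc_near, cc_far = noise_rate(y_true, y_cc, near), noise_rate(y_true, y_cc, far)
id_near, id_far = noise_rate(y_true, y_id, near), noise_rate(y_true, y_id, far)

print(f"class-conditional  near={cc_near:.3f}  far={cc_far:.3f}")
print(f"instance-dependent near={id_near:.3f}  far={id_far:.3f}")
```

In this toy setting the class-conditional error rate is roughly the same in both regions, while the heuristic's errors concentrate near its threshold, which is why a single per-class noise rate describes inferred labels poorly.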
