Which Is More Effective in Label Noise Cleaning, Correction or Filtering?

Gaoxia Jiang; Jia Zhang; Xuefei Bai; Wenjian Wang; Deyu Meng

doi:10.1609/aaai.v38i11.29183

Authors

Gaoxia Jiang, Shanxi University
Jia Zhang, Shanxi University
Xuefei Bai, Shanxi University
Wenjian Wang, Shanxi University
Deyu Meng, Xi'an Jiaotong University

DOI:

https://doi.org/10.1609/aaai.v38i11.29183

Keywords:

ML: Deep Learning Algorithms, ML: Classification and Regression

Abstract

Most noise cleaning methods adopt either correction or filtering to build robust models. However, the effectiveness, applicability, and hyper-parameter insensitivity of these two modes have not been carefully studied. We compare the two cleaning modes via a rebuilt error bound in noisy environments. At the dataset level, Theorem 5 implies that correction is more effective than filtering when the cleaned datasets have close noise rates. At the sample level, Theorem 6 indicates that confident label noise (large noise probability) is better suited to correction, while unconfident noise (medium noise probability) should be filtered. In addition, an imperfect hyper-parameter may have less negative impact on filtering than on correction. Unlike existing methods with a single cleaning mode, the proposed Fusion cleaning framework of Correction and Filtering (FCF) combines the advantages of both modes to deal with diverse suspicious labels. Experimental results demonstrate that our FCF method can achieve state-of-the-art performance on benchmark datasets.
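The sample-level guidance above (correct labels with large noise probability, filter those with medium noise probability) can be illustrated with a minimal sketch. This is not the authors' FCF algorithm: the function name `fuse_clean`, the two thresholds `correct_thr` and `filter_thr`, their default values, and the assumption that per-sample noise probabilities and model-predicted labels are already available are all hypothetical choices made for illustration.

```python
def fuse_clean(labels, noise_probs, predicted, correct_thr=0.8, filter_thr=0.5):
    """Toy fusion of correction and filtering (illustrative only).

    labels       observed, possibly noisy labels
    noise_probs  estimated probability that each label is noisy (assumed given)
    predicted    labels suggested by some model (assumed given)
    correct_thr  hypothetical cutoff for "confident" noise -> correct
    filter_thr   hypothetical cutoff for "unconfident" noise -> filter

    Returns a list of (original_index, cleaned_label) for the kept samples.
    """
    if not 0.0 <= filter_thr <= correct_thr <= 1.0:
        raise ValueError("expected 0 <= filter_thr <= correct_thr <= 1")

    cleaned = []
    for i, (y, p, y_hat) in enumerate(zip(labels, noise_probs, predicted)):
        if p >= correct_thr:
            # Large noise probability: replace the label with the prediction.
            cleaned.append((i, y_hat))
        elif p >= filter_thr:
            # Medium noise probability: drop the sample instead of relabeling.
            continue
        else:
            # Small noise probability: keep the observed label unchanged.
            cleaned.append((i, y))
    return cleaned


result = fuse_clean(
    labels=[0, 1, 1, 0],
    noise_probs=[0.1, 0.9, 0.6, 0.3],
    predicted=[0, 0, 0, 1],
)
print(result)
```

In this toy run, sample 1 is relabeled, sample 2 is dropped, and samples 0 and 3 keep their observed labels. How the noise probabilities are estimated and how the cutoffs are chosen are left open here; the abstract only states the qualitative rule.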
