Which Is More Effective in Label Noise Cleaning, Correction or Filtering?
DOI:
https://doi.org/10.1609/aaai.v38i11.29183
Keywords:
ML: Deep Learning Algorithms, ML: Classification and Regression
Abstract
Most label noise cleaning methods adopt either the correction mode or the filtering mode to build robust models. However, the effectiveness, applicability, and hyper-parameter insensitivity of the two modes have not been carefully studied. We compare the two cleaning modes via a rebuilt error bound in noisy environments. At the dataset level, Theorem 5 implies that correction is more effective than filtering when the cleaned datasets have close noise rates. At the sample level, Theorem 6 indicates that confident label noise (large noise probability) is better suited to correction, whereas unconfident label noise (medium noise probability) should be filtered. In addition, an imperfect hyper-parameter may have less negative impact on filtering than on correction. Unlike existing methods with a single cleaning mode, the proposed Fusion cleaning framework of Correction and Filtering (FCF) combines the advantages of both modes to deal with diverse suspicious labels. Experimental results demonstrate that our FCF method can achieve state-of-the-art performance on benchmark datasets.
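The sample-level rule stated in the abstract (correct labels with large noise probability, filter those with medium noise probability) can be illustrated with a minimal sketch. This is not the authors' FCF implementation: the function name `fuse_clean`, the thresholds `t_filter` and `t_correct`, and the assumption that per-sample noise probabilities and model-predicted labels are already available are all hypothetical choices made for illustration.

```python
import numpy as np

def fuse_clean(labels, noise_prob, pred_labels, t_filter=0.5, t_correct=0.9):
    """Illustrative fusion of correction and filtering (hypothetical, not FCF itself).

    labels      : observed (possibly noisy) labels
    noise_prob  : estimated probability that each label is noisy (assumed given)
    pred_labels : labels predicted by some model (assumed given)
    t_filter, t_correct : hypothetical thresholds with t_filter < t_correct
    """
    labels = np.asarray(labels)
    noise_prob = np.asarray(noise_prob, dtype=float)
    pred_labels = np.asarray(pred_labels)

    # Confident noise (large noise probability): replace the label.
    correct = noise_prob >= t_correct
    # Unconfident noise (medium noise probability): drop the sample.
    drop = (noise_prob >= t_filter) & ~correct

    cleaned = np.where(correct, pred_labels, labels)
    keep = ~drop
    # Return the cleaned labels of the retained samples and the retention mask.
    return cleaned[keep], keep

# Example: the second label is corrected, the third sample is filtered out.
cleaned, keep = fuse_clean(
    labels=[0, 1, 2, 1],
    noise_prob=[0.1, 0.95, 0.6, 0.3],
    pred_labels=[0, 2, 0, 1],
)
```

Samples below `t_filter` are kept unchanged. How the noise probabilities are estimated and how the thresholds are chosen is left open here, since the abstract does not specify it.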
Published
2024-03-24
How to Cite
Jiang, G., Zhang, J., Bai, X., Wang, W., & Meng, D. (2024). Which Is More Effective in Label Noise Cleaning, Correction or Filtering? Proceedings of the AAAI Conference on Artificial Intelligence, 38(11), 12866-12873. https://doi.org/10.1609/aaai.v38i11.29183
Section
AAAI Technical Track on Machine Learning II