DUEL: Duplicate Elimination on Active Memory for Self-Supervised Class-Imbalanced Learning

Authors

  • Won-Seok Choi, Seoul National University
  • Hyundo Lee, Seoul National University
  • Dong-Sig Han, Seoul National University
  • Junseok Park, Seoul National University
  • Heeyeon Koo, Yonsei University
  • Byoung-Tak Zhang, Seoul National University; AI Institute of Seoul National University (AIIS)

DOI:

https://doi.org/10.1609/aaai.v38i10.29040

Keywords:

ML: Bio-inspired Learning, ML: Representation Learning, ML: Unsupervised & Self-Supervised Learning

Abstract

Recent machine learning algorithms have been developed using well-curated datasets, which often require substantial cost and resources. On the other hand, the direct use of raw data often leads to overfitting towards frequently occurring class information. To address class imbalances cost-efficiently, we propose an active data filtering process during self-supervised pre-training in our novel framework, Duplicate Elimination (DUEL). This framework integrates an active memory inspired by human working memory and introduces distinctiveness information, which measures the diversity of the data in the memory, to optimize both the feature extractor and the memory. The DUEL policy, which replaces the most duplicated data with new samples, aims to enhance the distinctiveness information in the memory and thereby mitigate class imbalances. We validate the effectiveness of the DUEL framework in class-imbalanced environments, demonstrating its robustness and providing reliable results in downstream tasks. We also analyze the role of the DUEL policy in the training process through various metrics and visualizations.
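
The replacement policy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how duplication is measured, so the sketch assumes that the "most duplicated" entry is the one with the highest summed cosine similarity to the other entries in memory. The function names (`cosine`, `most_duplicated_index`, `duel_update`) and the toy feature vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_duplicated_index(memory):
    """Return the index of the entry most similar to the rest of the memory.

    Assumption: duplication is scored as the summed cosine similarity
    to all other entries. The paper may define it differently.
    """
    scores = []
    for i, u in enumerate(memory):
        scores.append(sum(cosine(u, v) for j, v in enumerate(memory) if j != i))
    return max(range(len(memory)), key=scores.__getitem__)


def duel_update(memory, new_feature):
    """Overwrite the most duplicated entry with a new sample, in place.

    Returns the index that was replaced.
    """
    idx = most_duplicated_index(memory)
    memory[idx] = new_feature
    return idx


# Toy memory: three near-duplicate features and one distinct feature.
memory = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.05], [0.0, 1.0]]
replaced = duel_update(memory, [-1.0, 0.0])
```

In this toy memory, one of the three near-duplicate entries is overwritten and the distinct entry `[0.0, 1.0]` is retained, so the memory becomes more diverse after the update. A framework of the kind the abstract describes would presumably compute features with the feature extractor being trained, which this sketch leaves out.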

Published

2024-03-24

How to Cite

Choi, W.-S., Lee, H., Han, D.-S., Park, J., Koo, H., & Zhang, B.-T. (2024). DUEL: Duplicate Elimination on Active Memory for Self-Supervised Class-Imbalanced Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(10), 11579-11587. https://doi.org/10.1609/aaai.v38i10.29040

Section

AAAI Technical Track on Machine Learning I