SLaNT: A Semi-supervised Label Noise-Tolerant Framework for Text Sentiment Analysis

Authors

  • Bin Cao Zhejiang University of Technology
  • Kai Jiang Zhejiang University of Technology
  • Jing Fan Zhejiang University of Technology

DOI:

https://doi.org/10.1609/icwsm.v18i1.31307

Abstract

The exponential growth of user-generated comment data on social media platforms has greatly promoted research on text sentiment analysis. However, user comments that express conflicting sentiments, referred to as 'user comments with noisy labels', pose a significant challenge to the reliability of sentiment analysis models. Many current approaches address this issue by either discarding noisy samples or assigning small weights to them during training, but these strategies can waste samples and reduce model robustness. In this paper, we present SLaNT, a novel semi-supervised label noise-tolerant framework specifically designed for text sentiment analysis. SLaNT employs a four-module pipeline consisting of Noisy Data Identification, Data Augmentation, Noisy Data Relabeling, and Re-training. Notably, SLaNT introduces an early stopping strategy to efficiently identify noisy samples. Additionally, to mitigate confirmation bias during the relabeling of noisy data, SLaNT integrates a co-relabeling strategy based on ensemble learning. Experimental results on four user-comment text datasets demonstrate that SLaNT significantly outperforms four strong baselines.
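
The abstract does not spell out the co-relabeling rule, so the following is only a minimal sketch of one common way ensemble agreement can limit confirmation bias. It assumes each flagged (noisy) comment has already received one predicted label per ensemble member. The function names `co_relabel` and `relabel_noisy`, the `min_agreement` parameter, and the sample identifiers are hypothetical and do not come from the paper.

```python
from collections import Counter


def co_relabel(votes, min_agreement=1.0):
    """Return the label the ensemble agrees on, or None if agreement is too low.

    votes: predicted labels for ONE flagged sample, one per ensemble member.
    min_agreement: assumed threshold on the fraction of members that must
    share the majority label (1.0 means unanimous agreement).
    """
    if not votes:
        raise ValueError("votes must contain at least one prediction")
    label, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= min_agreement:
        return label
    return None


def relabel_noisy(noisy_votes, min_agreement=1.0):
    """Split flagged samples into relabeled ones and unresolved ones.

    noisy_votes: dict mapping a sample id to its list of ensemble votes.
    Returns (relabeled, unresolved), where relabeled maps id -> new label and
    unresolved lists the ids on which the ensemble did not agree enough.
    """
    relabeled, unresolved = {}, []
    for sample_id, votes in noisy_votes.items():
        label = co_relabel(votes, min_agreement)
        if label is None:
            unresolved.append(sample_id)
        else:
            relabeled[sample_id] = label
    return relabeled, unresolved


if __name__ == "__main__":
    # Hypothetical votes from a three-model ensemble on three flagged comments.
    example_votes = {
        "c1": ["pos", "pos", "pos"],
        "c2": ["pos", "neg", "pos"],
        "c3": ["neg", "neg", "neg"],
    }
    print(relabel_noisy(example_votes))
```

Requiring agreement across several models, rather than trusting a single model's own prediction, is the design choice that addresses confirmation bias in this sketch; the threshold actually used by SLaNT, if any, is not stated in the abstract.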

Published

2024-05-28

How to Cite

Cao, B., Jiang, K., & Fan, J. (2024). SLaNT: A Semi-supervised Label Noise-Tolerant Framework for Text Sentiment Analysis. Proceedings of the International AAAI Conference on Web and Social Media, 18(1), 191-202. https://doi.org/10.1609/icwsm.v18i1.31307