SLaNT: A Semi-supervised Label Noise-Tolerant Framework for Text Sentiment Analysis

Bin Cao; Kai Jiang; Jing Fan

doi:10.1609/icwsm.v18i1.31307

Authors

Bin Cao, Zhejiang University of Technology
Kai Jiang, Zhejiang University of Technology
Jing Fan, Zhejiang University of Technology

DOI: https://doi.org/10.1609/icwsm.v18i1.31307

Abstract

The exponential growth of user-generated comment data on social media platforms has greatly promoted research on text sentiment analysis. However, the presence of conflicting sentiments within user comments, known as 'user comments with noisy labels', poses a significant challenge to the reliability of sentiment analysis models. Many current approaches address this issue by either discarding noisy samples or assigning small weights to them during training, but these strategies can lead to sample wastage and reduced model robustness. In this paper, we present SLaNT, a novel semi-supervised label noise-tolerant framework specifically designed for text sentiment analysis. SLaNT employs a four-module pipeline that includes Noisy Data Identification, Data Augmentation, Noisy Data Relabeling, and Re-training. Notably, SLaNT introduces an early stopping strategy to efficiently identify noisy samples. Additionally, to mitigate confirmation bias during the relabeling of noisy data, a unique co-relabeling strategy based on ensemble learning is integrated into SLaNT. Experimental results on four text user comment datasets demonstrate that SLaNT significantly outperforms four selected strong baselines.
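
The abstract names the pipeline stages but gives no implementation details, so the following is only a minimal sketch of how two of them might look. It assumes, hypothetically, that a sample is flagged as noisy when an early-stopped model's prediction disagrees with its given label, and that co-relabeling accepts a new label only when enough ensemble members agree. The function names `identify_noisy` and `co_relabel` and the `min_agree` parameter are illustrative and do not come from the paper.

```python
from collections import Counter
from typing import List, Optional, Sequence


def identify_noisy(given_labels: Sequence[int], early_preds: Sequence[int]) -> List[int]:
    """Return indices whose given label disagrees with an early-stopped model.

    Assumption (not from the paper): disagreement is the noise criterion.
    """
    return [i for i, (y, p) in enumerate(zip(given_labels, early_preds)) if y != p]


def co_relabel(votes: Sequence[Sequence[int]], min_agree: int) -> List[Optional[int]]:
    """Relabel noisy samples by ensemble agreement.

    votes[m][i] is model m's predicted label for noisy sample i.
    A sample receives the most common predicted label if at least
    `min_agree` models chose it; otherwise it stays unlabeled (None).
    Assumption (not from the paper): agreement counting is the rule used.
    """
    relabeled: List[Optional[int]] = []
    for sample_votes in zip(*votes):  # iterate per sample across models
        label, count = Counter(sample_votes).most_common(1)[0]
        relabeled.append(label if count >= min_agree else None)
    return relabeled


if __name__ == "__main__":
    given = [1, 0, 1, 1]
    early = [1, 1, 1, 0]
    noisy_idx = identify_noisy(given, early)
    # Three hypothetical ensemble members vote on the two flagged samples.
    votes = [[1, 0], [1, 1], [1, 0]]
    print(noisy_idx, co_relabel(votes, min_agree=3))
```

Requiring agreement among several models before accepting a new label is one common way to limit confirmation bias, since no single model's mistakes are fed back into its own training data unchecked.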
