ThatiAR: Subjectivity Detection in Arabic News Sentences
DOI: https://doi.org/10.1609/icwsm.v19i1.35960
Abstract
In this study, we present ThatiAR, the first large dataset for subjectivity detection in Arabic, consisting of ~3.6K manually annotated sentences and GPT-4o-based explanations. In addition, we include instructions (in both English and Arabic) to facilitate LLM-based fine-tuning. We provide an in-depth analysis of the dataset and the annotation process, along with extensive benchmark results covering both PLMs and LLMs. Our analysis of the annotation process highlights that annotators were strongly influenced by their political, cultural, and religious backgrounds, especially at the beginning of the annotation process. The experimental results suggest that LLMs with in-context learning provide better performance. We release the dataset and resources to the community.
Published
2025-06-07
How to Cite
Suwaileh, R., Hasanain, M., Hubail, F., Zaghouani, W., & Alam, F. (2025). ThatiAR: Subjectivity Detection in Arabic News Sentences. Proceedings of the International AAAI Conference on Web and Social Media, 19(1), 2587-2602. https://doi.org/10.1609/icwsm.v19i1.35960
Section
Dataset Papers