ThatiAR: Subjectivity Detection in Arabic News Sentences

Authors

  • Reem Suwaileh, Hamad bin Khalifa University
  • Maram Hasanain, Qatar Computing Research Institute
  • Fatema Hubail, Free University of Berlin
  • Wajdi Zaghouani, Northwestern University Qatar
  • Firoj Alam, HBKU

DOI

https://doi.org/10.1609/icwsm.v19i1.35960

Abstract

In this study, we present ThatiAR, the first large dataset for subjectivity detection in Arabic, consisting of ~3.6K manually annotated sentences and GPT-4o-based explanations. In addition, we include instructions (in both English and Arabic) to facilitate LLM-based fine-tuning. We provide an in-depth analysis of the dataset and the annotation process, along with extensive benchmark results covering pre-trained language models (PLMs) and large language models (LLMs). Our analysis of the annotation process highlights that annotators were strongly influenced by their political, cultural, and religious backgrounds, especially at the beginning of the annotation process. The experimental results suggest that LLMs with in-context learning provide better performance. We release the dataset and resources to the community.

Published

2025-06-07

How to Cite

Suwaileh, R., Hasanain, M., Hubail, F., Zaghouani, W., & Alam, F. (2025). ThatiAR: Subjectivity Detection in Arabic News Sentences. Proceedings of the International AAAI Conference on Web and Social Media, 19(1), 2587-2602. https://doi.org/10.1609/icwsm.v19i1.35960