A Dataset for Analysing News Framing in Chinese Media

Owen Cook; Yida Mu; Xinye Yang; Xingyi Song; Kalina Bontcheva

doi:10.1609/icwsm.v19i1.35943

Authors

Owen Cook, University of Sheffield
Yida Mu, University of Sheffield
Xinye Yang, University of Sheffield
Xingyi Song, University of Sheffield
Kalina Bontcheva, University of Sheffield

Abstract

Framing is an essential device in news reporting, allowing writers to influence public perceptions of current affairs. While automatic news framing detection datasets exist in various languages, none focuses on news framing in Chinese, a language whose complex character meanings and distinctive linguistic features present particular challenges. This study introduces the first Chinese News Framing dataset, which can be used either as a stand-alone dataset or as a supplementary resource to the SemEval-2023 task 3 dataset. We detail its creation and run baseline experiments to demonstrate the need for such a dataset and to establish benchmarks for future research, reporting results obtained by fine-tuning XLM-RoBERTa-Base and by using GPT-4o in the zero-shot setting. We find that GPT-4o performs significantly worse than fine-tuned XLM-RoBERTa across all languages. For Chinese, we obtain an F1-micro score (the performance metric for SemEval task 3, subtask 2) of 0.719 using only samples from our Chinese News Framing dataset, and a score of 0.753 when we augment the SemEval dataset with Chinese news framing samples. Given these positive detection results, the dataset is a valuable resource for detecting news frames in Chinese and a useful supplement to the SemEval-2023 task 3 dataset.
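
For readers unfamiliar with the F1-micro metric quoted above, the following is a minimal sketch of micro-averaged F1 for multi-label frame detection, where each article may carry several frames. It is not the authors' evaluation code: the function name `micro_f1`, the toy articles, and the frame label strings are illustrative assumptions only.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over multi-label predictions.

    gold, pred: lists of sets of frame labels, one set per article.
    Counts are pooled over all articles and labels before computing F1.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # frames predicted and present in the gold set
        fp += len(p - g)   # frames predicted but absent from the gold set
        fn += len(g - p)   # gold frames the prediction missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: three articles with placeholder frame labels.
gold = [{"Economic", "Political"}, {"Health_and_safety"}, {"Morality"}]
pred = [{"Economic"}, {"Health_and_safety", "Political"}, {"Morality"}]
score = micro_f1(gold, pred)
print(f"F1-micro: {score:.3f}")
```

Because the counts are pooled before the ratio is taken, frequent frames influence the score more than rare ones, which is the usual distinction between micro and macro averaging.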
