Are LLMs Enough for Hyperpartisan, Fake, Polarized and Harmful Content Detection? Evaluating In-Context Learning vs. Fine-Tuning
DOI:
https://doi.org/10.1609/icwsm.v20i1.42712
Abstract
This study provides a comprehensive benchmark of Large Language Model (LLM) adaptation paradigms, specifically Fine-Tuning (FT) vs. In-Context Learning (ICL), for the detection of hyperpartisan news, fake news, political bias, and harmful content. We evaluate encoder-only and decoder-only architectures across 10 datasets in five languages: English, Spanish, Portuguese, Arabic, and Bulgarian. Our analysis covers a wide spectrum of ICL strategies, including zero-shot prompts, rule-based codebooks, Chain-of-Thought (CoT), and few-shot selection using both random and diversity-optimized (Determinantal Point Process) exemplars. Experimental results reveal that FT consistently outperforms ICL; notably, smaller fine-tuned models often surpass the performance of larger models (e.g., Llama-3.1-8B, Mistral-Nemo, and Qwen2.5-7B) used in ICL settings. We further find that model architecture suitability is task-dependent: fine-tuned decoders excel at political bias and fake news detection, while encoders remain superior for hyperpartisan and harmful tweet classification. Among ICL methods, the codebook approach generally yields the highest accuracy, frequently outperforming CoT. Our findings underscore that despite the versatility of LLMs, task-specific fine-tuning remains the most effective strategy for identifying nuanced problematic content online.
Published
2026-05-25
How to Cite
Maggini, M. J., Merzougui, D., Bandyopadhyay, R., Dias, G., Maurel, F., & Gamallo, P. (2026). Are LLMs Enough for Hyperpartisan, Fake, Polarized and Harmful Content Detection? Evaluating In-Context Learning vs. Fine-Tuning. Proceedings of the International AAAI Conference on Web and Social Media, 20(1), 1551–1572. https://doi.org/10.1609/icwsm.v20i1.42712
Section
Full Papers