Beyond Metadata: Multimodal, Policy-Aware Detection of YouTube Scam Videos
DOI:
https://doi.org/10.1609/icwsm.v20i1.42698
Abstract
YouTube is a major platform for information and entertainment, but its wide accessibility also makes it attractive for scammers to upload deceptive or malicious content. Prior detection approaches rely largely on textual or statistical metadata, such as titles, descriptions, view counts, or likes, which are effective in many cases but can be evaded through benign-looking text, manipulated statistics, or other obfuscation strategies (e.g., ‘Leetspeak’), while ignoring visual cues. In this study, we systematically investigate multimodal approaches for detecting YouTube scams. Our dataset consolidates established scam categories and augments them with full-length videos and policy-grounded reasoning annotations. Experiments show that a text-only model using titles and descriptions (fine-tuned BERT) achieves moderate performance (76.61% F1 score), improving slightly with audio transcripts (77.98% F1 score). Visual analysis with a fine-tuned LLaVA-Video model performs better (79.61% F1 score), while a multimodal framework combining titles, descriptions, and video frames achieves the highest performance (82.96% F1 score). Moreover, the multimodal framework showed greater robustness to adversarial perturbations, with accuracy dropping only 1–3%, compared to 12–38% for modality-specific models. Beyond accuracy, the multimodal framework provides interpretable, policy-grounded reasoning, enhancing transparency and practical utility in automated moderation. Using this approach, we analyzed 6,374 in-the-wild YouTube videos and detected 1,864 scams with explicit reasoning, providing a valuable resource for future research.
Published
2026-05-25
How to Cite
Kulsum, U., Sabir, A., S.B., A., & Das, A. (2026). Beyond Metadata: Multimodal, Policy-Aware Detection of YouTube Scam Videos. Proceedings of the International AAAI Conference on Web and Social Media, 20(1), 1331–1348. https://doi.org/10.1609/icwsm.v20i1.42698
Section
Full Papers