SafeLens: Segment-Level Hate Speech Detection in Online Videos

Zhuoran Wang; Dylan Raharja; Yujia Hu; Roy Ka-Wei Lee

doi:10.1609/aaai.v40i48.42390

Authors

Zhuoran Wang Singapore University of Technology and Design
Dylan Raharja Singapore University of Technology and Design
Yujia Hu Singapore University of Technology and Design
Roy Ka-Wei Lee Singapore University of Technology and Design

Abstract

We present SafeLens, a lightweight segment-level video moderation system that fuses speech, text, and visual frames to detect hateful content in each segment. For every segment, SafeLens returns a structured prediction: a label, a prediction confidence, the reasons for the flag, and the harm categories. The structured predictions are optimized for triage, appeals, and downstream enforcement. The system is modular, with pluggable speech, text, and visual processing back-ends and a mid-size policy Large Language Model (LLM) agent with parameter-efficient tuning. In the live demo, attendees can upload or select clips, scrub the timeline to flag hateful segments, inspect rationales, and vary the policy LLM agent to benchmark hateful content moderation performance.
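
To make the per-segment structured prediction concrete, the following is a minimal sketch of what such a record and a simple triage step might look like. It is not the SafeLens implementation: the class name `SegmentPrediction`, the field names, the `triage` helper, the 0.5 threshold, and the sample values are all assumptions introduced for illustration. Only the four kinds of output (label, confidence, reasons, harm categories) come from the abstract.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SegmentPrediction:
    """Hypothetical record for one video segment; all field names are assumptions."""
    start_s: float                 # segment start time in seconds
    end_s: float                   # segment end time in seconds
    label: str                     # e.g. "hateful" or "non-hateful"
    confidence: float              # prediction confidence in [0, 1]
    reasons: List[str] = field(default_factory=list)          # reasons for the flag
    harm_categories: List[str] = field(default_factory=list)  # harm categories


def triage(predictions, threshold=0.5):
    """Return hateful segments at or above the threshold, most confident first."""
    flagged = [
        p for p in predictions
        if p.label == "hateful" and p.confidence >= threshold
    ]
    return sorted(flagged, key=lambda p: p.confidence, reverse=True)


# Illustrative, made-up values; not results from the paper.
example = [
    SegmentPrediction(0.0, 10.0, "non-hateful", 0.93),
    SegmentPrediction(10.0, 20.0, "hateful", 0.81,
                      ["slur in transcript"], ["ethnicity"]),
    SegmentPrediction(20.0, 30.0, "hateful", 0.42,
                      ["ambiguous on-screen text"], ["religion"]),
]

for p in triage(example):
    print(f"{p.start_s:.0f}-{p.end_s:.0f}s {p.label} "
          f"({p.confidence:.2f}): {p.reasons}")
```

Keeping the reasons and harm categories alongside the label is what would let a reviewer handle an appeal from the record alone; the threshold is a placeholder that a deployment would presumably tune.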

Published

2026-03-14

How to Cite

Wang, Z., Raharja, D., Hu, Y., & Lee, R. K.-W. (2026). SafeLens: Segment-Level Hate Speech Detection in Online Videos. Proceedings of the AAAI Conference on Artificial Intelligence, 40(48), 41712–41714. https://doi.org/10.1609/aaai.v40i48.42390

Issue

Vol. 40 No. 48: EAAI-26 AI for Education, Model AI Assignments, AAAI-26 Emerging Trends, Doctoral Consortium, Student Abstracts, Undergraduate Consortium and Demonstrations

Section

AAAI Demonstration Track
