[1]
Ye, Y. et al. 2025. NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models. Proceedings of the International AAAI Conference on Web and Social Media. 19, 1 (Jun. 2025), 2603–2612. DOI:https://doi.org/10.1609/icwsm.v19i1.35961.