Ye, Y., Le, T., & Lee, D. (2025). NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models. Proceedings of the International AAAI Conference on Web and Social Media, 19(1), 2603–2612. https://doi.org/10.1609/icwsm.v19i1.35961