Ye, Yiran, Thai Le, and Dongwon Lee. “NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models.” Proceedings of the International AAAI Conference on Web and Social Media 19, no. 1 (June 7, 2025): 2603–2612. Accessed May 29, 2026. https://ojs.aaai.org/index.php/ICWSM/article/view/35961.