Ye Y, Le T, Lee D. NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models. ICWSM [Internet]. 2025 Jun. 7 [cited 2026 May 29];19(1):2603-12. Available from: https://ojs.aaai.org/index.php/ICWSM/article/view/35961