Ye, Yiran, et al. “NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models.” Proceedings of the International AAAI Conference on Web and Social Media, vol. 19, no. 1, June 2025, pp. 2603-12, doi:10.1609/icwsm.v19i1.35961.