Multi-Agent VLMs Guided Self-Training with PNU Loss for Low-Resource Offensive Content Detection
DOI:
https://doi.org/10.1609/aaai.v40i46.41288
Abstract
Accurate detection of offensive content on social media demands high-quality labeled data; however, such data is often scarce due to the low prevalence of offensive instances and the high cost of manual annotation. To address this low-resource challenge, we propose a self-training framework that leverages abundant unlabeled data through collaborative pseudo-labeling. Starting with a lightweight classifier trained on limited labeled data, our method iteratively assigns pseudo-labels to unlabeled instances with the support of Multi-Agent Vision-Language Models (MA-VLMs). Unlabeled data on which the classifier and MA-VLMs agree are designated as the Agreed-Unknown set, while conflicting samples form the Disagreed-Unknown set. To enhance label reliability, MA-VLMs simulate two perspectives, moderator and user, capturing both regulatory and subjective viewpoints. The classifier is optimized using a novel Positive-Negative-Unlabeled (PNU) loss, which jointly exploits labeled, Agreed-Unknown, and Disagreed-Unknown data while mitigating pseudo-label noise. Experiments on benchmark datasets demonstrate that our framework substantially outperforms baselines under limited supervision and approaches the performance of large-scale models.
Published
2026-03-14
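The agreement-based partition and the weighted objective described in the abstract can be sketched as follows. This is an illustrative reconstruction only: the paper does not publish its exact formula here, so the loss weights (`w_a`, `w_d`) and the self-consistency term applied to Disagreed-Unknown samples are assumptions, not the authors' exact PNU formulation.

```python
import math

def bce(p, y, eps=1e-12):
    """Per-example binary cross-entropy for predicted probability p and label y."""
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def mean(xs):
    return sum(xs) / len(xs)

def partition_unlabeled(clf_labels, vlm_labels):
    """Split unlabeled indices into Agreed-Unknown (classifier and MA-VLM
    hard labels match) and Disagreed-Unknown (they conflict)."""
    agreed, disagreed = [], []
    for i, (c, v) in enumerate(zip(clf_labels, vlm_labels)):
        (agreed if c == v else disagreed).append(i)
    return agreed, disagreed

def pnu_loss(labeled, agreed, disagreed, w_a=0.5, w_d=0.1):
    """PNU-style objective sketch: full weight on labeled positive/negative
    data, reduced weight w_a on Agreed-Unknown pseudo-labels, and a small
    self-consistency term w_d on Disagreed-Unknown samples (the classifier's
    own hard prediction serves as the target). `labeled` and `agreed` are
    lists of (probability, label) pairs; `disagreed` is a list of
    probabilities."""
    loss = mean([bce(p, y) for p, y in labeled])
    if agreed:
        loss += w_a * mean([bce(p, y) for p, y in agreed])
    if disagreed:
        loss += w_d * mean([bce(p, 1.0 if p > 0.5 else 0.0)
                            for p in disagreed])
    return loss
```

In practice the partition would be recomputed each self-training round as both the classifier and the MA-VLM pseudo-labels are refreshed, with the down-weighted terms limiting how much pseudo-label noise propagates.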
How to Cite
Wang, H., Ji, D., Lu, J., Zhu, L., Zhang, H., Wu, H., Liu, L., Shu, P., & Lee, R. K.-W. (2026). Multi-Agent VLMs Guided Self-Training with PNU Loss for Low-Resource Offensive Content Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 40(46), 39387-39396. https://doi.org/10.1609/aaai.v40i46.41288
Issue
Section
AAAI Special Track on AI for Social Impact II