Multi-Agent VLMs Guided Self-Training with PNU Loss for Low-Resource Offensive Content Detection
DOI:
https://doi.org/10.1609/aaai.v40i46.41288
Abstract
Accurate detection of offensive content on social media demands high-quality labeled data; however, such data is often scarce due to the low prevalence of offensive instances and the high cost of manual annotation. To address this low-resource challenge, we propose a self-training framework that leverages abundant unlabeled data through collaborative pseudo-labeling. Starting with a lightweight classifier trained on limited labeled data, our method iteratively assigns pseudo-labels to unlabeled instances with the support of Multi-Agent Vision-Language Models (MA-VLMs). Unlabeled data on which the classifier and MA-VLMs agree are designated as the Agreed-Unknown set, while conflicting samples form the Disagreed-Unknown set. To enhance label reliability, MA-VLMs simulate two perspectives, moderator and user, capturing both regulatory and subjective viewpoints. The classifier is optimized using a novel Positive-Negative-Unlabeled (PNU) loss, which jointly exploits labeled, Agreed-Unknown, and Disagreed-Unknown data while mitigating pseudo-label noise. Experiments on benchmark datasets demonstrate that our framework substantially outperforms baselines under limited supervision and approaches the performance of large-scale models.
Published
2026-03-14
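The agreement-based partition and the weighted objective described in the abstract can be sketched as follows. This is an illustrative reconstruction only: the paper does not publish its exact formula here, so the loss weights (`w_a`, `w_d`) and the self-consistency term applied to Disagreed-Unknown samples are assumptions, not the authors' exact PNU formulation.

```python
import math

def bce(p, y, eps=1e-12):
    """Per-example binary cross-entropy for predicted probability p and label y."""
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def mean(xs):
    return sum(xs) / len(xs)

def partition_unlabeled(clf_labels, vlm_labels):
    """Split unlabeled indices into Agreed-Unknown (classifier and MA-VLM
    hard labels match) and Disagreed-Unknown (they conflict)."""
    agreed, disagreed = [], []
    for i, (c, v) in enumerate(zip(clf_labels, vlm_labels)):
        (agreed if c == v else disagreed).append(i)
    return agreed, disagreed

def pnu_loss(labeled, agreed, disagreed, w_a=0.5, w_d=0.1):
    """PNU-style objective sketch: full weight on labeled positive/negative
    data, reduced weight w_a on Agreed-Unknown pseudo-labels, and a small
    self-consistency term w_d on Disagreed-Unknown samples (the classifier's
    own hard prediction serves as the target). `labeled` and `agreed` are
    lists of (probability, label) pairs; `disagreed` is a list of
    probabilities."""
    loss = mean([bce(p, y) for p, y in labeled])
    if agreed:
        loss += w_a * mean([bce(p, y) for p, y in agreed])
    if disagreed:
        loss += w_d * mean([bce(p, 1.0 if p > 0.5 else 0.0)
                            for p in disagreed])
    return loss
```

In practice the partition would be recomputed each self-training round as both the classifier and the MA-VLM pseudo-labels are refreshed, with the down-weighted terms limiting how much pseudo-label noise propagates.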
How to Cite
Wang, H., Ji, D., Lu, J., Zhu, L., Zhang, H., Wu, H., Liu, L., Shu, P., & Lee, R. K.-W. (2026). Multi-Agent VLMs Guided Self-Training with PNU Loss for Low-Resource Offensive Content Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 40(46), 39387-39396. https://doi.org/10.1609/aaai.v40i46.41288
Issue
Section
AAAI Special Track on AI for Social Impact II