ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation

Mengyang Wu; Yuzhi Zhao; Jialun Cao; Mingjie Xu; Zhongming Jiang; Xuehui Wang; Qinbin Li; Guangneng Hu; Shengchao Qin; Chi-Wing Fu

doi:10.1609/aaai.v39i8.32908

Authors

Mengyang Wu: Department of Computer Science and Engineering, The Chinese University of Hong Kong; Huawei Hong Kong Research Center
Yuzhi Zhao: Huawei Hong Kong Research Center
Jialun Cao: Department of Computer Science and Engineering, The Hong Kong University of Science and Technology
Mingjie Xu: Huawei 2012 Laboratories
Zhongming Jiang: Huawei 2012 Laboratories
Xuehui Wang: Artificial Intelligence Institute, Shanghai Jiao Tong University
Qinbin Li: School of Computer Science and Technology, Huazhong University of Science and Technology
Guangneng Hu: School of Computer Science and Technology, Xidian University
Shengchao Qin: Guangzhou Institute of Technology, Xidian University; ICTT and ISN Laboratory, Xidian University
Chi-Wing Fu: Department of Computer Science and Engineering, The Chinese University of Hong Kong

DOI:

https://doi.org/10.1609/aaai.v39i8.32908

Abstract

Controversial content inundates the Internet, infringing on various cultural norms and child protection standards. Traditional Image Content Moderation (ICM) models fall short in producing precise moderation decisions for diverse standards, while recent multimodal large language models (MLLMs), when applied to general rule-based ICM, often produce classification and explanation results that are inconsistent with those of human moderators. Aiming at flexible, explainable, and accurate ICM, we design a novel rule-based dataset generation pipeline that decomposes concise human-defined rules and leverages well-designed multi-stage prompts to enrich short explicit image annotations. Our ICM-Instruct dataset includes detailed moderation explanations and moderation Q-A pairs. Built upon it, we create our ICM-Assistant model in the framework of rule-based ICM, making it readily applicable in real practice. Our ICM-Assistant model demonstrates exceptional performance and flexibility. Specifically, it significantly outperforms existing approaches on various sources, improving both moderation classification (36.8% on average) and moderation explanation quality (26.6% on average) consistently over existing MLLMs. Caution: Content includes offensive language or images.
