GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization

Yirui Chen; Xudong Huang; Quan Zhang; Wei Li; Mingjian Zhu; Qiangyu Yan; Simiao Li; Hanting Chen; Hailin Hu; Jie Yang; Wei Liu; Jie Hu

doi:10.1609/aaai.v39i2.32231

Authors

Yirui Chen Shanghai Jiao Tong University Huawei Noah's Ark Lab
Xudong Huang Huawei Noah's Ark Lab
Quan Zhang Tsinghua University Huawei Noah's Ark Lab
Wei Li Huawei Noah's Ark Lab
Mingjian Zhu Huawei Noah's Ark Lab
Qiangyu Yan Huawei Noah's Ark Lab
Simiao Li Huawei Noah's Ark Lab
Hanting Chen Huawei Noah's Ark Lab
Hailin Hu Huawei Noah's Ark Lab
Jie Yang Shanghai Jiao Tong University
Wei Liu Shanghai Jiao Tong University
Jie Hu Huawei Noah's Ark Lab

DOI:

https://doi.org/10.1609/aaai.v39i2.32231

Abstract

The extraordinary ability of generative models has emerged as a new trend in image editing and realistic image generation, posing a serious threat to the trustworthiness of multimedia data and driving research on image manipulation detection and localization (IMDL). However, the lack of a large-scale data foundation makes the IMDL task unattainable. In this paper, we build a local manipulation data generation pipeline that integrates the powerful capabilities of SAM, LLM, and generative models. On this basis, we propose the GIM dataset, which has the following advantages: 1) Large scale: GIM includes over one million pairs of AI-manipulated images and real images. 2) Rich image content: GIM encompasses a broad range of image classes. 3) Diverse generative manipulation: the images are manipulated with state-of-the-art generators across various manipulation tasks. These advantages allow for a more comprehensive evaluation of IMDL methods and extend their applicability to diverse images. We introduce the GIM benchmark with two settings to evaluate existing IMDL methods. In addition, we propose a novel IMDL framework, termed GIMFormer, which consists of a ShadowTracer, a Frequency-Spatial Block (FSB), and a Multi-Window Anomalous Modeling (MWAM) module. Extensive experiments on GIM demonstrate that GIMFormer surpasses the previous state-of-the-art approach on two different benchmarks.
