Aggregating Diverse Cue Experts for AI-Generated Image Detection

Authors

  • Lei Tan, National University of Singapore
  • Shuwei Li, National University of Singapore
  • Mohan Kankanhalli, National University of Singapore
  • Robby T. Tan, National University of Singapore; ASUS Intelligent Cloud Services (AICS)

DOI:

https://doi.org/10.1609/aaai.v40i11.37890

Abstract

The rapid emergence of image synthesis models poses challenges to the generalization of AI-generated image detectors. Existing methods often rely on model-specific features, leading to overfitting and poor generalization. In this paper, we introduce the Multi-Cue Aggregation Network (MCAN), a novel framework that integrates different yet complementary cues as input. MCAN employs a mixture-of-encoders adapter to dynamically process these cues, enabling more adaptive and robust feature representation. Our cues include the input image itself, which represents the overall content, and high-frequency components that emphasize edge details. Additionally, we introduce a Chromatic Inconsistency (CI) cue, which normalizes intensity values and captures noise information introduced during the image acquisition process in real images, making these noise patterns more distinguishable from those in AI-generated content. Unlike prior methods, MCAN employs a multi-cue aggregation strategy, leveraging spatial, frequency, and chromaticity-based cues. These cues are intrinsically more indicative of real images, enhancing cross-model generalization. Extensive experiments on the GenImage, Chameleon, and UniversalFakeDetect benchmarks validate the state-of-the-art performance of MCAN. On the GenImage dataset, MCAN outperforms the best state-of-the-art method by up to 7.4% in average ACC across eight different image generators.
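As a rough illustration of the three cue types named in the abstract, the sketch below builds a spatial, a frequency, and a chromaticity view of one image. It is not the authors' implementation: the abstract does not give the CI formula or the high-frequency filter, so the per-pixel division by R + G + B and the box-blur residual used here are assumptions chosen only to make the idea concrete, and the function names (`chromaticity`, `high_pass`, `build_cues`) are hypothetical. Plain Python lists are used to keep the example dependency-free.

```python
def chromaticity(image, eps=1e-6):
    """Assumed intensity normalization: divide each RGB value by R + G + B."""
    return [[tuple(c / (sum(px) + eps) for c in px) for px in row]
            for row in image]


def to_gray(image):
    """Average the three channels to get a single-channel image."""
    return [[sum(px) / 3.0 for px in row] for row in image]


def high_pass(gray, radius=1):
    """Assumed high-frequency filter: pixel minus its local box-blur mean."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [gray[j][i] for j in ys for i in xs]
            row.append(gray[y][x] - sum(window) / len(window))
        out.append(row)
    return out


def build_cues(image):
    """Return the three views that a multi-cue detector could consume."""
    return {
        "spatial": image,                          # the image itself
        "frequency": high_pass(to_gray(image)),    # edge / fine detail
        "chromaticity": chromaticity(image),       # intensity-normalized color
    }


# Two flat images with the same color ratios but different brightness.
dim = [[(0.1, 0.2, 0.3)] * 4 for _ in range(4)]
bright = [[(0.2, 0.4, 0.6)] * 4 for _ in range(4)]
cues_dim = build_cues(dim)
cues_bright = build_cues(bright)
```

Under these assumptions, the two flat images share nearly identical chromaticity maps despite differing in brightness, and their high-frequency maps are essentially zero because they contain no edges. In the paper's framework each such view would be passed to the mixture-of-encoders adapter, which this sketch does not attempt to model.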

Published

2026-03-14

How to Cite

Tan, L., Li, S., Kankanhalli, M., & Tan, R. T. (2026). Aggregating Diverse Cue Experts for AI-Generated Image Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 40(11), 9314–9322. https://doi.org/10.1609/aaai.v40i11.37890

Section

AAAI Technical Track on Computer Vision VIII