DiffSeg30k: A Multi-Turn Diffusion Editing Benchmark for Localized AIGC Detection
Hai Ci, Ziheng Peng, Pei Yang, Yingxin Xuan, Mike Zheng Shou
2025-11-26
Summary
This paper introduces a new dataset called DiffSeg30k, which is designed to help detect edits made to images by AI image generators, specifically those using a technique called diffusion. It's becoming harder to tell what's real and what's AI-generated, and this dataset aims to improve our ability to spot those fakes.
What's the problem?
Currently, most methods for detecting AI-generated images try to classify whether an *entire* image is real or fake. However, diffusion-based editing often changes only *parts* of an image, and existing tools miss these localized edits. This means someone could subtly alter a photo with AI and it wouldn't be detected. There also wasn't a good dataset for training models to pinpoint *where* the changes were made, rather than just flagging the whole image as suspect.
What's the solution?
The researchers created DiffSeg30k, a collection of 30,000 images that have been edited using eight different AI image generators. What makes this dataset special is that it includes detailed 'pixel-level annotations' – it shows exactly which pixels in each image were changed by the AI. They also made the edits realistic: each image undergoes up to three sequential edits, such as adding, removing, or changing things, mimicking how people actually edit photos in multiple passes. A vision-language model automatically figured out which parts of the image to edit and what changes to make, so the edits look natural. They then tested existing AI 'segmentation' tools (which are good at identifying different parts of an image) to see how well they could both locate the edits *and* identify which AI generator was used.
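Benchmarking this kind of task comes down to comparing a model's predicted mask against the pixel-level annotation, per editing model. A minimal sketch of per-class IoU scoring, assuming a label convention (not stated here) of 0 for unedited pixels and 1–8 for the eight diffusion editors:

```python
import numpy as np

def per_class_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int = 9) -> dict:
    """Per-class intersection-over-union between predicted and
    ground-truth edit masks.

    Assumed label convention (illustrative, not from the paper):
    0 = unedited background, 1..8 = one class per diffusion editor.
    Classes absent from both masks are skipped rather than scored.
    """
    ious = {}
    for c in range(num_classes):
        pred_c = pred == c
        gt_c = gt == c
        union = np.logical_or(pred_c, gt_c).sum()
        if union == 0:
            continue  # class appears in neither mask; undefined IoU
        inter = np.logical_and(pred_c, gt_c).sum()
        ious[c] = inter / union
    return ious
```

Averaging the resulting values gives the usual mIoU; a localization-only variant would collapse all editor classes into a single "edited" label first.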
Why it matters?
This work is important because it shifts the focus from simply detecting *whether* an image is AI-generated to locating *where* the AI edits are. The dataset and experiments show that segmentation models can not only pinpoint the altered areas but also serve surprisingly well as whole-image classifiers of whether an image was edited at all, outperforming some existing forgery detectors. This research will help develop better tools to identify and understand AI-generated content, which is crucial as AI image generation becomes more common and sophisticated.
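One way a segmentation model can double as a whole-image detector is to pool its per-pixel edit probabilities into a single score. The pooling rule below (averaging the top-k% most confident pixels) is a common heuristic sketch, not necessarily the paper's exact procedure; it keeps a small localized edit from being washed out by a large clean background:

```python
import numpy as np

def image_level_score(edit_prob: np.ndarray, top_k_frac: float = 0.01) -> float:
    """Collapse a per-pixel edit-probability map (values in [0, 1])
    into one image-level score by averaging the top-k% most
    confident pixels. top_k_frac is a hypothetical tuning knob.
    """
    flat = np.sort(edit_prob.ravel())[::-1]  # descending confidence
    k = max(1, int(flat.size * top_k_frac))
    return float(flat[:k].mean())

def is_edited(edit_prob: np.ndarray, threshold: float = 0.5) -> bool:
    """Binary real-vs-edited decision from the pooled score."""
    return image_level_score(edit_prob) > threshold
```

With mean pooling over all pixels instead, a 1%-area edit at probability 0.9 would contribute under 0.01 to the score; top-k pooling surfaces it directly.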
Abstract
Diffusion-based editing enables realistic modification of local image regions, making AI-generated content harder to detect. Existing AIGC detection benchmarks focus on classifying entire images, overlooking the localization of diffusion-based edits. We introduce DiffSeg30k, a publicly available dataset of 30k diffusion-edited images with pixel-level annotations, designed to support fine-grained detection. DiffSeg30k features: 1) In-the-wild images--we collect images or image prompts from COCO to reflect real-world content diversity; 2) Diverse diffusion models--local edits using eight SOTA diffusion models; 3) Multi-turn editing--each image undergoes up to three sequential edits to mimic real-world sequential editing; and 4) Realistic editing scenarios--a vision-language model (VLM)-based pipeline automatically identifies meaningful regions and generates context-aware prompts covering additions, removals, and attribute changes. DiffSeg30k shifts AIGC detection from binary classification to semantic segmentation, enabling simultaneous localization of edits and identification of the editing models. We benchmark three baseline segmentation approaches, revealing significant challenges in semantic segmentation tasks, particularly concerning robustness to image distortions. Experiments also reveal that segmentation models, despite being trained for pixel-level localization, emerge as highly reliable whole-image classifiers of diffusion edits, outperforming established forgery classifiers while showing great potential in cross-generator generalization. We believe DiffSeg30k will advance research in fine-grained localization of AI-generated content by demonstrating the promise and limitations of segmentation-based methods. DiffSeg30k is released at: https://huggingface.co/datasets/Chaos2629/Diffseg30k
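A DiffSeg30k-style sample pairs an image with its segmentation mask and the metadata of up to three sequential edits. The record sketch below is illustrative only; the field names and operation labels are assumptions, not the released schema:

```python
from dataclasses import dataclass, field

@dataclass
class EditTurn:
    """One edit in a multi-turn sequence (hypothetical fields)."""
    editor: str     # which of the eight diffusion models applied this edit
    operation: str  # e.g. "add", "remove", or "attribute_change"
    prompt: str     # VLM-generated, context-aware edit instruction

@dataclass
class DiffSegSample:
    """One benchmark image with its edit history (hypothetical schema)."""
    image_id: str
    turns: list = field(default_factory=list)  # up to three sequential edits

    def editors_used(self) -> list:
        """Distinct editors, in the order they were first applied —
        these are the classes a segmentation model must distinguish."""
        seen = []
        for t in self.turns:
            if t.editor not in seen:
                seen.append(t.editor)
        return seen
```

Multi-turn records like this are what make the benchmark a multi-class segmentation task: different regions of the same image can carry labels from different editors.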