ImagiNet: A Multi-Content Dataset for Generalizable Synthetic Image Detection via Contrastive Learning

Delyan Boychev, Radostin Cholakov

2024-07-30

Summary

This paper introduces ImagiNet, a dataset for training and evaluating detectors of synthetic images, which are computer-generated and can look very similar to real photos and artwork. The dataset is meant to help AI systems both flag these images and identify which generator produced them.

What's the problem?

As technology advances, generative models can create images that are almost indistinguishable from real ones. This makes it difficult for online platforms to detect fake images, which can lead to issues like impersonation and misinformation. Too few diverse datasets are available for training AI systems to recognize synthetic images, which limits detector performance.

What's the solution?

To solve this problem, the authors developed ImagiNet, a balanced, high-resolution dataset of 200,000 real and synthetic examples across four content categories: photos, paintings, faces, and uncategorized. Synthetic images come from both open-source and proprietary generators, and real images of the same content types are collected from public datasets. The dataset supports two evaluation tracks: classifying an image as real or synthetic, and identifying which generative model produced a synthetic image. As a baseline, the authors trained a ResNet-50 model with a self-supervised contrastive objective (SelfCon) for each track. It reaches an AUC of up to 0.99 and balanced accuracy of 86% to 95% on established benchmarks, even under social network conditions that involve compression and resizing.
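The two evaluation types described above can be pictured as two labels derived from the same example. The following is a minimal sketch, not the authors' data format: the `Example` class, its field names, and the generator identifier `generator_a` are hypothetical and chosen only for illustration. How real images are handled in the model-identification track is not stated here, so this sketch simply skips them.

```python
from dataclasses import dataclass
from typing import Optional

# The four content categories named in the summary.
CONTENT_TYPES = ("photos", "paintings", "faces", "uncategorized")


@dataclass(frozen=True)
class Example:
    """Hypothetical record for one image (field names are assumptions)."""
    path: str
    content_type: str
    generator: Optional[str]  # None for a real image, otherwise a generator id

    def __post_init__(self) -> None:
        if self.content_type not in CONTENT_TYPES:
            raise ValueError(f"unknown content type: {self.content_type}")


def track_one_label(example: Example) -> int:
    """Track i: 0 = real, 1 = synthetic."""
    return 0 if example.generator is None else 1


def track_two_label(example: Example) -> Optional[str]:
    """Track ii: which generator produced the image (None for real images,
    which this sketch leaves out of the identification task)."""
    return example.generator


examples = [
    Example("real_photo.jpg", "photos", None),
    Example("synthetic_face.png", "faces", "generator_a"),  # hypothetical id
]
detection_labels = [track_one_label(e) for e in examples]
identification_labels = [track_two_label(e) for e in examples if e.generator]
```

Keeping both labels on one record means the same images can serve either track without duplicating data.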

Why does it matter?

This research is important because it gives detector developers a balanced, multi-content resource for training and benchmarking. Better detection can help platforms combat misinformation and impersonation and protect users from being misled by synthetic content online, which is crucial for maintaining trust in digital media.

Abstract

Generative models, such as diffusion models (DMs), variational autoencoders (VAEs), and generative adversarial networks (GANs), produce images with a level of authenticity that makes them nearly indistinguishable from real photos and artwork. While this capability is beneficial for many industries, the difficulty of identifying synthetic images leaves online media platforms vulnerable to impersonation and misinformation attempts. To support the development of defensive methods, we introduce ImagiNet, a high-resolution and balanced dataset for synthetic image detection, designed to mitigate potential biases in existing resources. It contains 200K examples, spanning four content categories: photos, paintings, faces, and uncategorized. Synthetic images are produced with open-source and proprietary generators, whereas real counterparts of the same content type are collected from public datasets. The structure of ImagiNet allows for a two-track evaluation system: i) classification as real or synthetic and ii) identification of the generative model. To establish a baseline, we train a ResNet-50 model using a self-supervised contrastive objective (SelfCon) for each track. The model demonstrates state-of-the-art performance and high inference speed across established benchmarks, achieving an AUC of up to 0.99 and balanced accuracy ranging from 86% to 95%, even under social network conditions that involve compression and resizing. Our data and code are available at https://github.com/delyan-boychev/imaginet.
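For readers unfamiliar with the two metrics quoted in the abstract, the sketch below computes balanced accuracy and ROC AUC for a binary real-versus-synthetic detector in plain Python. It is an illustration of the standard definitions, not the authors' evaluation code; the toy labels, scores, and the 0.5 decision threshold are assumptions made up for the example.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of the per-class recalls for real (0) and synthetic (1) images."""
    recalls = []
    for cls in (0, 1):
        indices = [i for i, y in enumerate(y_true) if y == cls]
        if not indices:
            raise ValueError(f"no examples of class {cls}")
        correct = sum(1 for i in indices if y_pred[i] == cls)
        recalls.append(correct / len(indices))
    return sum(recalls) / len(recalls)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random synthetic image receives a higher score
    than a random real image, counting ties as one half."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy data (assumed): 0 = real, 1 = synthetic; scores are "synthetic" scores.
y_true = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.40, 0.60, 0.35, 0.80, 0.90]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

bal_acc = balanced_accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Balanced accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.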
