CIFAKE contains 120,000 images: 60,000 real and 60,000 AI-generated. Every image belongs to one of two classes:
- REAL: These images are sourced from the CIFAR-10 dataset, curated by Krizhevsky & Hinton.
- FAKE: These images are synthetically generated using Stable Diffusion version 1.4, mimicking the style and structure of CIFAR-10.
Dataset Specifications:
- Training Set: 100,000 images (50,000 per class)
- Testing Set: 20,000 images (10,000 per class)
- Classes: REAL and FAKE
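
As a quick sanity check after downloading, the counts above can be verified with a short script. This is a minimal sketch, not part of the dataset itself. It assumes the archive unpacks into train/REAL, train/FAKE, test/REAL and test/FAKE folders and that images use a .jpg, .jpeg or .png extension. The folder layout, the extension list, and the helper names count_images and find_mismatches are all assumptions made for illustration.

```python
from pathlib import Path

# Expected image counts, taken from the specifications above.
EXPECTED_COUNTS = {
    ("train", "REAL"): 50_000,
    ("train", "FAKE"): 50_000,
    ("test", "REAL"): 10_000,
    ("test", "FAKE"): 10_000,
}

# Assumption: image files use one of these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(root):
    """Count image files in each <root>/<split>/<class> folder (assumed layout)."""
    counts = {}
    for split, label in EXPECTED_COUNTS:
        folder = Path(root) / split / label
        if folder.is_dir():
            counts[(split, label)] = sum(
                1
                for p in folder.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )
        else:
            # A missing folder is reported as zero images.
            counts[(split, label)] = 0
    return counts


def find_mismatches(counts, expected=EXPECTED_COUNTS):
    """Return {(split, class): (found, expected)} for every count that differs."""
    return {
        key: (counts.get(key, 0), want)
        for key, want in expected.items()
        if counts.get(key, 0) != want
    }
```

Calling find_mismatches(count_images("path/to/cifake")) with the placeholder path replaced by the local download location should return an empty dictionary if the layout assumption holds and the download is complete.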
Research Relevance
The CIFAKE dataset supports research on classifying AI-generated images and on developing explainable AI models. Because the REAL and FAKE classes are the same size in both the training and testing sets, a model that always predicts one class scores 50% accuracy, which gives a clear baseline for training and evaluating machine learning models that detect AI-generated content.
This dataset is sourced from Kaggle.