deepfake dataset collected on the web for deepfake detection

OpenTAI/wild-deepfake

WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection

Deepfake in the Wild

📌 Dataset Description

Existing deepfake datasets such as DeepfakeDetection and FaceForensics++ have advanced detection research, but they are limited in two ways: their real videos are recorded under constrained conditions with only a few actors, and their fake videos are generated with a handful of popular deepfake tools. As a result, detectors trained on these datasets often struggle with the diversity of real-world deepfakes found online.

To address this, we introduce WildDeepfake, a dataset of 7,314 face sequences from 707 deepfake videos sourced entirely from the internet. Despite its small size, WildDeepfake better represents the challenges of real-world detection, where baseline detectors show significantly reduced performance.

To enhance detection, we also propose Attention-based Deepfake Detection Networks (ADDNets), utilizing 2D and 3D attention mechanisms to improve focus on real/fake facial features.

📂 Dataset Contents

  1. A comparison with previous datasets (before our work):

| Dataset name | Download | Generation method | Deepfake videos | Actors |
|---|---|---|---|---|
| Deepfake-TIMIT low | download | Deepfake | 320 | 32 |
| Deepfake-TIMIT high | download | Deepfake | 320 | 32 |
| Faceforensics | - | Deepfake | 1000 | 977 |
| Faceforensics++ | download | Deepfake | 1000 | 977 |
| Deepfake detection | download | Deepfake | over 3000 | 28 |
| Celeb-deepfakeforensics v1 | download | Deepfake | 795 | 13 |
| Celeb-deepfakeforensics v2 | download | Deepfake | 590 | 59 |
| DFDC | download | Deepfake | - | - |
| WildDeepfake | download | Internet | 707 | Unknown |
  2. File structure:
deepfake_in_the_wild
                    |--real train
                                 |--0.tar.gz
                                 |--1.tar.gz
                                 |--2.tar.gz
                                 ...
                    |--real test
                                |--0.tar.gz
                                |--1.tar.gz
                                |--2.tar.gz
                                ...
                    |--fake train
                                 |--0.tar.gz
                                 |--1.tar.gz
                                 |--2.tar.gz
                                 ...
                    |--fake test
                                |--0.tar.gz
                                |--1.tar.gz
                                |--2.tar.gz
                                ...

Each tar.gz file contains several folders of face images, and the images in one folder form one face sequence. Each image's file name is the frame number at which that face appears in the original video.
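The layout described above can be read with the Python standard library alone. The following is a minimal sketch, not an official loader: it assumes the four split folders are named exactly as in the tree, that image file names are plain integers followed by an extension such as `.png` or `.jpg`, and that one folder inside an archive equals one face sequence. The function names (`list_archives`, `group_sequences`, `sequences_in_archive`) and the `IMAGE_SUFFIXES` set are hypothetical and chosen only for illustration.

```python
import tarfile
from collections import defaultdict
from pathlib import Path, PurePosixPath

# Assumption: split folder names match the tree shown above.
SPLITS = {
    "real train": ("real", "train"),
    "real test": ("real", "test"),
    "fake train": ("fake", "train"),
    "fake test": ("fake", "test"),
}

# Assumption: the image format is not stated in the README.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_archives(root):
    """Yield (label, split, archive_path) for every .tar.gz under root."""
    root = Path(root)
    for folder, (label, split) in SPLITS.items():
        for archive in sorted((root / folder).glob("*.tar.gz")):
            yield label, split, archive


def group_sequences(member_names):
    """Group image paths by parent folder, sorted by frame number.

    Files whose name is not an integer frame number are skipped.
    """
    sequences = defaultdict(list)
    for name in member_names:
        path = PurePosixPath(name)
        if path.suffix.lower() in IMAGE_SUFFIXES and path.stem.isdigit():
            sequences[str(path.parent)].append(name)
    return {
        folder: sorted(names, key=lambda n: int(PurePosixPath(n).stem))
        for folder, names in sequences.items()
    }


def sequences_in_archive(archive_path):
    """Return {sequence_folder: [image paths in frame order]} for one archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]
    return group_sequences(names)
```

A typical use would be to loop over `list_archives("deepfake_in_the_wild")` and call `sequences_in_archive` on each path, keeping the `label` as the real/fake target. Sorting by the integer value of the file name matters because a plain string sort would place frame `10` before frame `2`.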

⬇️ Request for Download

You will need to fill out an agreement form to use the dataset, which is now available on Hugging Face.

📜 Cite Us

If you use this dataset in your research, please cite it as follows:

@inproceedings{zi2020wilddeepfake,
  title={Wilddeepfake: A challenging real-world dataset for deepfake detection},
  author={Zi, Bojia and Chang, Minghao and Chen, Jingjing and Ma, Xingjun and Jiang, Yu-Gang},
  booktitle={Proceedings of the 28th ACM International Conference on Multimedia},
  pages={2382--2390},
  year={2020}
}

📝 Privacy Statement

To ensure the privacy of individuals featured in the dataset, we have implemented the following measures:

  • Restricted Use: The dataset is strictly for research purposes, and only face sequences are released, not full videos.
  • Privacy Protection in Publications: Key facial features are obscured in all visual materials, including papers and presentations. Additionally, strict access controls are in place.
  • Applicant Verification: Access is granted only after verifying the applicant’s academic email address, personal electronic signature, and other necessary credentials.
  • Usage Agreement: Applicants are required to sign a comprehensive agreement to ensure the dataset is used exclusively for research purposes.
  • Right to Removal: If any part of the dataset impacts you, please contact us to request its removal.

We are committed to safeguarding privacy while enabling research advancements.
