GitHub - OpenTAI/wild-deepfake: deepfake dataset collected on the web for deepfake detection

WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection

📌 Dataset Description

Existing deepfake datasets like DeepfakeDetection and FaceForensics++ have advanced detection research but are limited by constrained real videos featuring a few actors and fake videos generated using popular software. As a result, detectors trained on these datasets often struggle with the diversity of real-world deepfakes found online.

To address this, we introduce WildDeepfake, a dataset of 7,314 face sequences from 707 deepfake videos sourced entirely from the internet. Despite its small size, WildDeepfake better represents the challenges of real-world detection, where baseline detectors show significantly reduced performance.

To enhance detection, we also propose Attention-based Deepfake Detection Networks (ADDNets), utilizing 2D and 3D attention mechanisms to improve focus on real/fake facial features.

📂 Dataset Contents

A comparison with previous datasets (before our work)

Dataset name	Download	Generation method	Deepfake videos	Actors
Deepfake-TIMIT low	download	Deepfake	320	32
Deepfake-TIMIT high	download	Deepfake	320	32
FaceForensics	-	Deepfake	1000	977
FaceForensics++	download	Deepfake	1000	977
DeepfakeDetection	download	Deepfake	over 3000	28
Celeb-deepfakeforensics v1	download	Deepfake	795	13
Celeb-deepfakeforensics v2	download	Deepfake	590	59
DFDC	download	Deepfake	-	-
WildDeepfake	download	Internet	707	Unknown

File Structure:

deepfake_in_the_wild
    |--real train
        |--0.tar.gz
        |--1.tar.gz
        |--2.tar.gz
        ...
    |--real test
        |--0.tar.gz
        |--1.tar.gz
        |--2.tar.gz
        ...
    |--fake train
        |--0.tar.gz
        |--1.tar.gz
        |--2.tar.gz
        ...
    |--fake test
        |--0.tar.gz
        |--1.tar.gz
        |--2.tar.gz
        ...

Each tar.gz file contains several folders of face images, and the images in one folder form one face sequence. Each image file name is the frame number at which that face appears in the original video.
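As a rough illustration of the layout described above, the following sketch groups the images in one archive into face sequences and orders each sequence by frame number. It is not part of the released dataset tooling. The function name `group_face_sequences` is hypothetical, and the sketch assumes that an image's parent folder identifies its sequence and that each image file name (without its extension) is a plain integer.

```python
import tarfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_face_sequences(archive_path):
    """Map each folder in a tar.gz archive to its image names, ordered by frame number."""
    sequences = defaultdict(list)
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directory entries
            path = PurePosixPath(member.name)
            # Assumption: the parent folder identifies the face sequence.
            sequences[str(path.parent)].append(path)

    ordered = {}
    for folder, paths in sequences.items():
        # Assumption: the file stem is an integer frame number, e.g. "12.png".
        # Files whose stem is not numeric are ignored.
        frames = [p for p in paths if p.stem.isdigit()]
        ordered[folder] = [p.name for p in sorted(frames, key=lambda p: int(p.stem))]
    return ordered
```

Called on one of the archives, for example `0.tar.gz` from the `real train` folder, the function returns a dictionary keyed by folder path. Sorting numerically rather than lexically keeps frame 2 ahead of frame 10.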

⬇️ Request for Download

You will need to fill out an agreement form to use the dataset, which is now available on Hugging Face.

📜 Cite Us

If you use this dataset in your research, please cite it as follows:

@inproceedings{zi2020wilddeepfake,
  title={Wilddeepfake: A challenging real-world dataset for deepfake detection},
  author={Zi, Bojia and Chang, Minghao and Chen, Jingjing and Ma, Xingjun and Jiang, Yu-Gang},
  booktitle={Proceedings of the 28th ACM International Conference on Multimedia},
  pages={2382--2390},
  year={2020}
}

📝 Privacy Statement

To ensure the privacy of individuals featured in the dataset, we have implemented the following measures:

Restricted Use: The dataset is strictly for research purposes, and only face sequences are released, not full videos.
Privacy Protection in Publications: Key facial features are obscured in all visual materials, including papers and presentations. Additionally, strict access controls are in place.
Applicant Verification: Access is granted only after verifying the applicant’s academic email address, personal electronic signature, and other necessary credentials.
Usage Agreement: Applicants are required to sign a comprehensive agreement to ensure the dataset is used exclusively for research purposes.
Right to Removal: If any part of the dataset impacts you, please contact us to request its removal.

We are committed to safeguarding privacy while enabling research advancements.
