AI-Face-FairnessBench

This repository is the official implementation of our paper AI-Face: A Million-Scale Demographically Annotated AI-Generated Face Dataset and Fairness Benchmark

Dataset Overview

License

The AI-Face Dataset is licensed under CC BY-NC-ND 4.0

Download

If you would like to access the AI-Face Dataset, please download and sign the EULA. Upload the signed EULA to the Google Form and fill in the required details. Once the form is approved, the download link will be sent to you. If you have any questions, please email lin1785@purdue.edu or hu968@purdue.edu.

1. Installation

You can run the following script to configure the necessary environment:

cd AI-Face-FairnessBench
conda create -n FairnessBench python=3.9.0
conda activate FairnessBench
pip install -r requirements.txt

2. Dataset Preparation and Description

After obtaining our AI-Face dataset, put the provided train.csv and test.csv under ./dataset.

train.csv and test.csv are formatted as follows:

| Column | Description |
| --- | --- |
| Image Path | Path to the image file |
| Uncertainty Score Gender | Uncertainty score for the gender annotation |
| Uncertainty Score Age | Uncertainty score for the age annotation |
| Uncertainty Score Race | Uncertainty score for the race annotation |
| Ground Truth Gender | Gender label: 1 - Male, 0 - Female |
| Ground Truth Age | Age label: 0 - Young, 1 - Middle-aged, 2 - Senior, 3 - Others |
| Ground Truth Race | Race label: 0 - Asian, 1 - White, 2 - Black, 3 - Others |
| Intersection | 0 - (Male, Asian), 1 - (Male, White), 2 - (Male, Black), 3 - (Male, Others), 4 - (Female, Asian), 5 - (Female, White), 6 - (Female, Black), 7 - (Female, Others) |
| Target | Label indicating a real (0) or fake (1) image |
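For illustration, the integer encodings above can be decoded into human-readable labels like this (a minimal sketch: the mappings follow the table, but the helper names are hypothetical and not part of the repository):

```python
# Decoding helpers for the label encodings described in the table above.
# Mappings mirror the CSV documentation; function names are illustrative only.

GENDER = {1: "Male", 0: "Female"}
AGE = {0: "Young", 1: "Middle-aged", 2: "Senior", 3: "Others"}
RACE = {0: "Asian", 1: "White", 2: "Black", 3: "Others"}
TARGET = {0: "real", 1: "fake"}

def decode_intersection(code):
    """Map the 0-7 Intersection code back to a (gender, race) pair."""
    gender = "Male" if code < 4 else "Female"
    race = RACE[code % 4]
    return gender, race

def decode_row(row):
    """Decode one CSV row (as a dict of column name -> int) into readable labels."""
    return {
        "gender": GENDER[row["Ground Truth Gender"]],
        "age": AGE[row["Ground Truth Age"]],
        "race": RACE[row["Ground Truth Race"]],
        "intersection": decode_intersection(row["Intersection"]),
        "target": TARGET[row["Target"]],
    }
```

For example, `decode_intersection(5)` returns `("Female", "White")`, consistent with the Intersection encoding in the table.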

📝 Note

Our AI-Face dataset contains face images from four deepfake video datasets: FF++, Celeb-DF, DFD, and DFDC. You can access these datasets with demographic annotations from the paper through the link provided in our Fairness-Generalization repository. Please be aware that we re-annotated the demographic attributes of these four deepfake video datasets for our AI-Face dataset, and these demographic annotations are provided with uncertainty scores in a CSV file formatted as described above. The annotations available through our Fairness-Generalization repository are different from those provided in our AI-Face dataset and are not accompanied by uncertainty scores.

After you receive the download link for the AI-Face dataset, you will see part1.tar and part2.tar. Please download both parts if you intend to use the entire dataset. The dataset is split into two parts because OneDrive does not allow files larger than 250 GB.

Requirements

Ensure your device has at least 300 GB of available space for this dataset.

Instructions

  1. Download part1.tar and part2.tar.
  2. Untar both files.
  3. Organize the data as shown below:
AI-Face Dataset
  ├── AttGAN
  ├── Latent_Diffusion
  ├── Palette
  ├── ...
  ├── ...
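After untarring, a quick sanity check that the expected subfolders are present might look like the sketch below (the folder names are taken from the tree above; the full dataset contains more subfolders than listed here):

```python
import os

# Partial list of expected generator subfolders, from the layout above.
# The real dataset contains more; extend this set as needed.
EXPECTED_SUBDIRS = {"AttGAN", "Latent_Diffusion", "Palette"}

def missing_subdirs(root):
    """Return the expected subfolders that are absent under `root`."""
    present = {name for name in os.listdir(root)
               if os.path.isdir(os.path.join(root, name))}
    return sorted(EXPECTED_SUBDIRS - present)
```

Calling `missing_subdirs("/path/to/AI-Face")` after extraction should return an empty list if the listed folders were untarred correctly.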

3. Load Pretrained Weights

Before running the training code, make sure the pre-trained weights are loaded. We provide pre-trained weights under ./training/pretrained. You can also download an Xception model trained on ImageNet (through this link) or use your own pretrained Xception.

4. Train

To run the training code, you should first go to the ./training/ folder, then run train_test.py:

cd training

python train_test.py 

You can adjust the parameters in train_test.py to specify the model, batch size, learning rate, etc.:

--lr: learning rate; default is 0.0005.

--train_batchsize: batch size for training; default is 128.

--test_batchsize: batch size for testing; default is 32.

--datapath: /path/to/dataset.

--model: detector name ['xception', 'efficientnet', 'core', 'ucf', 'srm', 'f3net', 'spsl', 'daw_fdd', 'dag_fdd', 'fair_df_detector']; default is 'xception'.

--dataset_type: dataset type loaded for the detector; default is 'no_pair'. For 'ucf' and 'fair_df_detector', it should be 'pair'.
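The flags above could be wired up roughly as follows (an illustrative argparse sketch using the documented defaults; the actual parser in train_test.py may differ):

```python
import argparse

# Illustrative parser mirroring the documented flags; not the repo's actual code.
def build_parser():
    p = argparse.ArgumentParser(description="AI-Face fairness benchmark training (sketch)")
    p.add_argument("--lr", type=float, default=0.0005, help="learning rate")
    p.add_argument("--train_batchsize", type=int, default=128, help="training batch size")
    p.add_argument("--test_batchsize", type=int, default=32, help="testing batch size")
    p.add_argument("--datapath", type=str, default="/path/to/dataset", help="dataset root")
    p.add_argument("--model", type=str, default="xception",
                   choices=["xception", "efficientnet", "core", "ucf", "srm",
                            "f3net", "spsl", "daw_fdd", "dag_fdd", "fair_df_detector"])
    p.add_argument("--dataset_type", type=str, default="no_pair",
                   choices=["no_pair", "pair"],
                   help="use 'pair' for 'ucf' and 'fair_df_detector'")
    return p

# Example: training UCF, which requires the paired dataset type.
args = build_parser().parse_args(["--model", "ucf", "--dataset_type", "pair"])
```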

📝 Note

To train ViT-b/16 and UnivFD, please run train_test_vit.py and train_test_clip.py, respectively.

| Detector | File name | Paper |
| --- | --- | --- |
| Xception | xception_detector.py | Xception: Deep learning with depthwise separable convolutions |
| EfficientNet-B4 | efficientnetb4_detector.py | Efficientnet: Rethinking model scaling for convolutional neural networks |
| ViT-B/16 | train_test_vit.py | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale |
| UCF | ucf_detector.py | UCF: Uncovering Common Features for Generalizable Deepfake Detection |
| UnivFD | train_test_clip.py | Towards Universal Fake Image Detectors that Generalize Across Generative Models |
| CORE | core_detector.py | CORE: Consistent Representation Learning for Face Forgery Detection |
| F3Net | f3net_detector.py | Thinking in Frequency: Face Forgery Detection by Mining Frequency-aware Clues |
| SRM | srm_detector.py | Generalizing Face Forgery Detection with High-frequency Features |
| SPSL | spsl_detector.py | Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain |
| DAW-FDD | daw_fdd.py | Improving Fairness in Deepfake Detection |
| DAG-FDD | dag_fdd.py | Improving Fairness in Deepfake Detection |
| PG-FDD | fair_df_detector.py | Preserving Fairness Generalization in Deepfake Detection |

If you use the AI-Face dataset in your research, please cite our paper as:

@article{lin2024aiface,
  title={AI-Face: A Million-Scale Demographically Annotated AI-Generated Face Dataset and Fairness Benchmark},
  author={Li Lin and Santosh and Xin Wang and Shu Hu},
  journal={arXiv preprint arXiv:2406.00783},
  year={2024}
}
