Download and Preprocessing scripts for VoxCeleb datasets

This is an auxiliary repo for downloading VoxCeleb videos and preprocessing the extracted frames by cropping them around the face. To detect and crop the face area, we use face-alignment, the landmark estimation method proposed in [1].

Installation

Python 3.5+
Linux
PyTorch (>=1.5)

Install requirements:

pip install -r requirements.txt

Install youtube-dl:

pip install --upgrade youtube_dl

Install ffmpeg:

sudo apt-get install ffmpeg

Download the auxiliary models and save them under `./pretrained_models`:

Path	Description
FaceDetector	SFD face detector for face-alignment.

Overview

Download the videos of the VoxCeleb1 or VoxCeleb2 dataset from YouTube
Split the videos into smaller chunks using the metadata provided with the datasets, and delete the original videos
Extract frames from each video at REF_FPS = 25
Crop the frames using the face boxes from the metadata and the facial landmarks
Files are saved as:

./path/to/voxdataset
|-- id10271                           # identity index
|   |-- 37nktPRUJ58                   # video index
|   |   |-- chunk_videos              # original video split into smaller chunks
|   |   |   |-- 37nktPRUJ58#00001#257-396.mp4 
|   |   |   |-- ...
|   |   |-- frames                    # extracted frames
|   |   |   |-- 00_000025.png
|   |   |   |-- ...
|   |   |-- frames_cropped            # preprocessed frames
|   |   |   |-- 00_000025.png
|   |   |   |-- ...
|   |-- Zjc7Xy7aT8c
|   |   | ...
|-- id10273
|   | ...

Download VoxCeleb datasets

Download the metadata for VoxCeleb1 and VoxCeleb2:

wget www.robots.ox.ac.uk/~vgg/data/voxceleb/data/vox1_test_txt.zip
unzip vox1_test_txt.zip
mv ./txt ./vox1_txt_test

wget www.robots.ox.ac.uk/~vgg/data/voxceleb/data/vox1_dev_txt.zip
unzip vox1_dev_txt.zip
mv ./txt ./vox1_txt_train

wget www.robots.ox.ac.uk/~vgg/data/voxceleb/data/vox2_test_txt.zip
unzip vox2_test_txt.zip
mv ./txt ./vox2_txt_test

wget www.robots.ox.ac.uk/~vgg/data/voxceleb/data/vox2_dev_txt.zip
unzip vox2_dev_txt.zip
mv ./txt ./vox2_txt_train
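
The four downloads above follow one pattern: fetch an archive, unzip it (which yields `./txt`), and rename that folder. As a sketch, the same pattern can be written as a table-driven loop. The helper `metadata_commands` is hypothetical; it only builds the shell commands shown above and does not run them.

```python
BASE_URL = "www.robots.ox.ac.uk/~vgg/data/voxceleb/data"

# (archive name, target folder), exactly as in the commands above
ARCHIVES = [
    ("vox1_test_txt.zip", "./vox1_txt_test"),
    ("vox1_dev_txt.zip", "./vox1_txt_train"),
    ("vox2_test_txt.zip", "./vox2_txt_test"),
    ("vox2_dev_txt.zip", "./vox2_txt_train"),
]


def metadata_commands(archive, target):
    """Return the three commands (as argument lists) for one archive."""
    return [
        ["wget", f"{BASE_URL}/{archive}"],
        ["unzip", archive],
        ["mv", "./txt", target],
    ]


for archive, target in ARCHIVES:
    for cmd in metadata_commands(archive, target):
        print(" ".join(cmd))
```

To execute the commands instead of printing them, each list could be passed to `subprocess.run(cmd, check=True)`.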

Run this script to download the videos from YouTube. Note that the original videos will be removed. The script can optionally extract and preprocess the frames as well.

python download_voxCeleb.py --dataset vox1 --output_path ./VoxCeleb1_test --metadata_path ./vox1_txt_test --delete_mp4

Preprocessing of video frames

If the videos have already been downloaded, run this script to extract and preprocess the frames:

python preprocess_voxCeleb.py --dataset vox1 --root_path ./VoxCeleb1_test --metadata_path ./vox1_txt_test
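
After preprocessing, a quick sanity check is to compare the number of extracted and cropped frames per video. The following is a minimal sketch, assuming the directory layout from the Overview (`<root>/<identity>/<video>/frames` and `frames_cropped`, with `.png` files). `count_frames` is a hypothetical helper and is not part of this repo.

```python
from pathlib import Path


def count_frames(root):
    """Yield (identity, video_id, n_frames, n_cropped) for each video folder."""
    root = Path(root)
    for identity_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for video_dir in sorted(p for p in identity_dir.iterdir() if p.is_dir()):
            # A missing folder simply counts as zero frames.
            n_frames = len(list((video_dir / "frames").glob("*.png")))
            n_cropped = len(list((video_dir / "frames_cropped").glob("*.png")))
            yield identity_dir.name, video_dir.name, n_frames, n_cropped


dataset_root = Path("./VoxCeleb1_test")  # output path used in the commands above
if dataset_root.is_dir():
    for identity, video, n_frames, n_cropped in count_frames(dataset_root):
        if n_frames != n_cropped:
            print(f"{identity}/{video}: {n_frames} frames, {n_cropped} cropped")
```

A mismatch is not necessarily an error, but it points to videos worth inspecting by hand.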

Acknowledgments

This code borrows from video-preprocessing and face-alignment.

References

[1] Bulat, Adrian, and Georgios Tzimiropoulos. "How far are we from solving the 2D & 3D face alignment problem? (and a dataset of 230,000 3D facial landmarks)." Proceedings of the IEEE International Conference on Computer Vision. 2017.
