Authors' official PyTorch implementation of Finding Directions in GAN's Latent Space for Neural Face Reenactment. The paper has been accepted for an oral presentation at the British Machine Vision Conference (BMVC), 2022. If you use this code for your research, please cite our paper.
Finding Directions in GAN's Latent Space for Neural Face Reenactment
Stella Bounareli, Vasileios Argyriou, Georgios Tzimiropoulos

Abstract: This paper is on face/head reenactment, where the goal is to transfer the facial pose (3D head orientation and expression) of a target face to a source face. Previous methods focus on learning embedding networks for identity and pose disentanglement, which proves to be a rather hard task, degrading the quality of the generated images. We take a different approach, bypassing the training of such networks, by using (fine-tuned) pre-trained GANs, which have been shown capable of producing high-quality facial images. Because GANs are characterized by weak controllability, the core of our approach is a method to discover which directions in latent GAN space are responsible for controlling facial pose and expression variations. We present a simple pipeline to learn such directions with the aid of a 3D shape model which, by construction, already captures disentangled directions for facial pose, identity and expression. Moreover, we show that by embedding real images in the GAN latent space, our method can be successfully used for the reenactment of real-world faces. Our method features several favorable properties, including using a single source image (one-shot) and enabling cross-person reenactment. Our qualitative and quantitative results show that our approach often produces reenacted faces of significantly higher quality than those produced by state-of-the-art methods on the standard benchmarks of VoxCeleb1 & 2.
Real image editing of head pose and expression
Self and Cross-subject Reenactment
We recommend running this repository using Anaconda.
conda create -n python38 python=3.8
conda activate python38
conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=11.0 -c pytorch
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install pytorch3d -c pytorch3d
pip install -r requirements.txt
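After installing, a quick sanity check can confirm the environment; this snippet is illustrative and not part of the repository, using only the standard module names of the packages installed above:

```python
# Illustrative environment check for the conda setup above.
import torch
import torchvision
import pytorch3d

print(torch.__version__)           # expected: 1.7.0
print(torchvision.__version__)     # expected: 0.8.0
print(torch.cuda.is_available())   # True on a machine with CUDA 11.0
print(pytorch3d.__version__)
```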
We provide a StyleGAN2 model trained using StyleGAN2-ada-pytorch and an e4e inversion model, both trained on the VoxCeleb1 dataset.
Path | Description |
---|---|
StyleGAN2-VoxCeleb1 | StyleGAN2 trained on the VoxCeleb1 dataset. |
e4e-VoxCeleb1 | e4e trained on the VoxCeleb1 dataset. |
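If the StyleGAN2-VoxCeleb1 checkpoint follows the network-pickle format of stylegan2-ada-pytorch, it can be loaded roughly as sketched below; the checkpoint file name is a placeholder, and `dnnlib`/`legacy` are the helper modules shipped with that repository:

```python
# Minimal loading sketch, assuming a stylegan2-ada-pytorch network pickle;
# the checkpoint file name below is a placeholder, not the actual file.
import torch
import dnnlib   # helper module from stylegan2-ada-pytorch
import legacy   # helper module from stylegan2-ada-pytorch

with dnnlib.util.open_url('./pretrained_models/stylegan2-voxceleb1.pkl') as f:
    G = legacy.load_network_pkl(f)['G_ema'].eval().cuda()

z = torch.randn(1, G.z_dim).cuda()
w = G.mapping(z, None)    # latent codes in W+, shape (1, num_ws, 512)
img = G.synthesis(w)      # synthesized image in [-1, 1], shape (1, 3, H, W)
```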
We provide additional auxiliary models needed during training.
Path | Description |
---|---|
face-detector | Pretrained face detector taken from face-alignment. |
IR-SE50 Model | Pretrained IR-SE50 model taken from InsightFace_Pytorch for use in our identity loss. |
DECA model | Pretrained model taken from DECA. Extract data.tar.gz under ./libs/DECA/. |
By default, we assume that all pretrained models are downloaded and saved to the directory ./pretrained_models.
- Download and preprocess the VoxCeleb dataset using VoxCeleb_preprocessing.
- Invert real images into the latent space of the pretrained StyleGAN2 using the Encoder4Editing method.
python invert_images.py --input_path path/to/voxdataset
The dataset is saved as:
path/to/voxdataset
|-- id10271                        # identity index
|   |-- 37nktPRUJ58                # video index
|   |   |-- frames_cropped         # preprocessed frames
|   |   |   |-- 00_000025.png
|   |   |   |-- ...
|   |   |-- inversion
|   |   |   |-- frames             # inverted frames
|   |   |   |   |-- 00_000025.png
|   |   |   |   |-- ...
|   |   |   |-- latent_codes       # inverted latent codes
|   |   |   |   |-- 00_000025.npy
|   |   |   |   |-- ...
|   |-- Zjc7Xy7aT8c
|   |   |-- ...
|-- id10273
|   |-- ...
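As a sanity check on the inversion, each saved latent code should approximately reproduce its inverted frame when passed back through the generator. A minimal sketch, assuming `G` is the generator loaded in the earlier snippet and that e4e stores codes in W+ space:

```python
# Sketch: decode one inverted latent code back to an image. `G` is assumed
# to be the StyleGAN2 generator loaded in the earlier snippet.
import numpy as np
import torch

code_path = 'path/to/voxdataset/id10271/37nktPRUJ58/inversion/latent_codes/00_000025.npy'
w = torch.from_numpy(np.load(code_path)).float().cuda()
if w.ndim == 2:              # (num_ws, 512) -> add a batch dimension
    w = w.unsqueeze(0)
img = G.synthesis(w)         # should closely match inversion/frames/00_000025.png
```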
Correct preprocessing of the dataset is important for reenacting the images; different preprocessing will lead to poor performance. Example:
To train our model, make sure that the required models are downloaded and saved under ./pretrained_models and that the training and testing data are configured as described above. Please check run_trainer.py and ./libs/configs/config_arguments.py for the training arguments.
Example of training using paired data:
python run_trainer.py \
--experiment_path ./training_attempts/exp_v00 \
--train_dataset_path path_to_training_dataset \
--test_dataset_path path_to_test_dataset \
--training_method paired
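Conceptually, training learns a matrix A that maps differences of 3D shape-model (pose/expression) parameters to shifts in the GAN latent space. The following is a rough, non-authoritative sketch of that idea only; the parameter count and tensor shapes are assumptions, not the repository's exact implementation:

```python
# Conceptual sketch of the learned directions, NOT the actual trainer.
# A maps a change in 3DMM pose/expression parameters (e.g. from DECA)
# to a shift of the latent code; identity is preserved because only
# the learned pose/expression directions are traversed.
import torch

NUM_PARAMS = 15    # assumed number of controlled pose/expression parameters
LATENT_DIM = 512   # StyleGAN2 latent width

A = torch.nn.Parameter(torch.randn(NUM_PARAMS, LATENT_DIM) * 0.01)

def reenact(w_source, p_source, p_target):
    """Shift the source latent so the output adopts the target's pose/expression."""
    delta_p = p_target - p_source            # (batch, NUM_PARAMS)
    delta_w = delta_p @ A                    # (batch, LATENT_DIM)
    return w_source + delta_w.unsqueeze(1)   # broadcast over the W+ layers
```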
Download our pretrained A-matrix model and save it under ./pretrained_models.
Given an input image or latent code, edit a single facial attribute corresponding to one of our learned directions.
python run_facial_editing.py \
--source_path ./inference_examples/0002775.png \
--output_path ./results/facial_editing \
--directions 0 1 2 3 4 \
--save_gif \
--optimize_generator
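Under the hood, editing a single attribute amounts to moving the latent code along one row of the learned direction matrix. A hedged sketch, reusing the assumed `w`, `A`, and `G` from the earlier snippets (the magnitude range is illustrative):

```python
# Sketch of single-direction editing; `w`, `A`, and `G` are assumed to come
# from the earlier snippets, and the edit magnitudes are illustrative.
import torch

direction = 0                                 # one of the indices in --directions
for alpha in torch.linspace(-3.0, 3.0, 7):    # sweep the edit magnitude
    w_edit = w + alpha * A[direction].view(1, 1, -1)
    img = G.synthesis(w_edit)                 # one frame of the edit sequence
```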
Given a source identity and a target video as input, reenact the source face. The source and target faces may share the same identity or have different identities.
python run_inference.py \
--source_path ./inference_examples/0002775.png \
--target_path ./inference_examples/lWOTF8SdzJw#2614-2801.mp4 \
--output_path ./results/ \
--save_video
[1] Stella Bounareli, Vasileios Argyriou, and Georgios Tzimiropoulos. Finding Directions in GAN's Latent Space for Neural Face Reenactment. In British Machine Vision Conference (BMVC), 2022.
Bibtex entry:
@article{bounareli2022finding,
title={Finding Directions in GAN's Latent Space for Neural Face Reenactment},
author={Bounareli, Stella and Argyriou, Vasileios and Tzimiropoulos, Georgios},
journal={British Machine Vision Conference (BMVC)},
year={2022}
}