FSRT: Facial Scene Representation Transformer for Face Reenactment (CVPR 2024)

Official GitHub repository for FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-pose, and Facial Expression Features (accepted to CVPR 2024) by Andre Rochow, Max Schwarz, and Sven Behnke.

[Paper] [Project Page]

Example Animations

The animated sequences (bottom row) are generated by transferring the motion extracted from the driving video (left) to the person in the source image (top row).

VoxCeleb Dataset

The source images and driving videos are selected from the official VoxCeleb test set. We demonstrate both animation modes supported by our method: absolute and relative motion transfer.

1. Absolute Motion Transfer

vox_abs.mp4

2. Relative Motion Transfer

vox_rel.mp4

CelebA-HQ Dataset

Our model generalizes to source images from the CelebA-HQ dataset and driving videos from the official VoxCeleb2 test set.

CelebA-HQ_rel_1.mp4

CelebA-HQ_rel_2.mp4

Setup

Please complete the following steps.

Clone the repository:

git clone https://github.com/andrerochow/fsrt.git
cd fsrt

We recommend creating a new conda environment:

conda create -n fsrt python=3.9
conda activate fsrt

Dependencies

This code requires at least Python 3.9 and PyTorch.

First, install PyTorch (>= 1.12.0). Additional dependencies can then be installed via:
```
pip install -r requirements.txt
```
If you want to animate with relative motion transfer and automatically find the best-matching frame, you also need to install the face-alignment library:
```
git clone https://github.com/1adrianb/face-alignment
cd face-alignment
pip install -r requirements.txt
python setup.py install
```
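Since face-alignment is only needed for relative motion transfer with automatic best-frame selection, it can help to check whether it is importable before running the demo with `--find_best_frame`. The following is a minimal sketch; the helper name `has_module` is hypothetical and not part of this repository, and it assumes the library installs under the module name `face_alignment`.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(name) is not None


def can_find_best_frame():
    """Assumption: the face-alignment library installs as `face_alignment`."""
    return has_module("face_alignment")
```

If `can_find_best_frame()` returns `False`, the optional installation steps above have not been completed in the active environment.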

Pretrained Checkpoints

Pretrained models are available on Google Drive.

The keypoint detector weights should be located at fsrt_checkpoints/kp_detector.pt. Note that all pretrained checkpoints are trained using the same keypoint detector weights.
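To confirm that the downloaded files are where the demo commands expect them, a short check along the following lines may be useful. This is only a sketch: the function name is hypothetical, and the file list assumes the `vox256.pt` checkpoint used in the demo commands of this README together with the keypoint detector weights.

```python
from pathlib import Path

# File names taken from this README; adjust if you use a different checkpoint.
EXPECTED_FILES = ("kp_detector.pt", "vox256.pt")


def missing_checkpoints(checkpoint_dir="fsrt_checkpoints", expected=EXPECTED_FILES):
    """Return the expected checkpoint files that are not present in checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_checkpoints()` from the repository root returns an empty list when both files are in place.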

Animation Demo

Animate with Relative Motion Transfer (Better ID Preservation):

python demo.py --checkpoint fsrt_checkpoints/vox256.pt --config runs/vox256/vox256.yaml --source_image path/to/source --driving_video path/to/driving --source_idx 0 --relative --adapt_scale --find_best_frame

Animate with Absolute Motion Transfer:

python demo.py --checkpoint fsrt_checkpoints/vox256.pt --config runs/vox256/vox256.yaml --source_image path/to/source --driving_video path/to/driving --source_idx 0
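The two commands above differ only in the three extra flags used for relative motion transfer. When animating many source/driving pairs, the command line could be assembled programmatically, as in the sketch below. The helper `build_demo_command` is hypothetical; it only uses the flags and default paths shown in this README.

```python
import shlex


def build_demo_command(source_image, driving_video, relative=True,
                       checkpoint="fsrt_checkpoints/vox256.pt",
                       config="runs/vox256/vox256.yaml", source_idx=0):
    """Assemble the demo.py argument list for absolute or relative motion transfer."""
    cmd = [
        "python", "demo.py",
        "--checkpoint", checkpoint,
        "--config", config,
        "--source_image", source_image,
        "--driving_video", driving_video,
        "--source_idx", str(source_idx),
    ]
    if relative:
        # Flags used in the relative motion transfer command of this README.
        cmd += ["--relative", "--adapt_scale", "--find_best_frame"]
    return cmd


def as_shell_string(cmd):
    """Render the argument list as a single, safely quoted shell command."""
    return shlex.join(cmd)
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, or printed with `as_shell_string` for inspection.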

VoxCeleb Dataset

Download the VoxCeleb dataset by following the instructions in this repository. We strongly recommend saving the videos in .mp4 format at the highest possible resolution, as they will be cropped to implement out-of-frame motion. In our case, we resized the larger dimension of each video to match the smaller dimension (e.g. 608x512 → 512x512).
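The resizing rule described above (shrink the larger dimension to the smaller one) can be sketched as follows. The use of ffmpeg here is an assumption for illustration, since this README does not state which tool was used; the helper names are hypothetical.

```python
def square_side(width, height):
    """Side length of the square output: the smaller of the two dimensions."""
    return min(width, height)


def build_resize_command(src, dst, width, height):
    """Build an ffmpeg command that resizes a video to side x side.

    Assumption: ffmpeg is installed; its `scale` filter takes `width:height`.
    Note that this rescales (changes the aspect ratio) rather than cropping,
    matching the description in this README.
    """
    side = square_side(width, height)
    return ["ffmpeg", "-i", src, "-vf", f"scale={side}:{side}", dst]
```

For the example in the text, `square_side(608, 512)` gives `512`, so a 608x512 video would be written as 512x512.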

Once the dataset is downloaded, run the extract_keypoints.py script on the videos:

python3 extract_keypoints.py --folder_in path/to/videos/ --folder_out path/to/output_folder/

This will store the face-alignment keypoints required for data augmentation.

Finally, split the training videos into two folders: path/to/data/train_videos/ for training and path/to/data/val_videos/ for validation.
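One possible way to perform this split is sketched below. The validation fraction, the random seed, and the assumption of a flat folder of `.mp4` files are all illustrative choices that this README does not specify, and the sketch does not move the extracted keypoint files.

```python
import random
import shutil
from pathlib import Path


def split_videos(video_dir, data_dir, val_fraction=0.05, seed=0, dry_run=True):
    """Randomly assign .mp4 files to train_videos/ and val_videos/.

    val_fraction and seed are hypothetical defaults, not values from the paper.
    With dry_run=True, nothing is moved and only the assignment is returned.
    """
    videos = sorted(Path(video_dir).glob("*.mp4"))
    random.Random(seed).shuffle(videos)  # deterministic shuffle for a given seed
    n_val = int(len(videos) * val_fraction)
    assignment = {"val_videos": videos[:n_val], "train_videos": videos[n_val:]}
    if not dry_run:
        for subdir, files in assignment.items():
            target = Path(data_dir) / subdir
            target.mkdir(parents=True, exist_ok=True)
            for video in files:
                shutil.move(str(video), str(target / video.name))
    return assignment
```

Running with `dry_run=True` first makes it possible to inspect the assignment before any files are moved.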

Training

To train an FSRT model, run:

torchrun --rdzv-backend=c10d --rdzv-endpoint=localhost:$PORT --nnodes 1 --nproc_per_node $NUM_GPUS train.py runs/vox256/vox256.yaml

The model checkpoints are automatically saved to the directory where the .yaml config file is located.
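The training command expects `$PORT` and `$NUM_GPUS` to be set. A small launcher sketch that picks a free local port and fills in both values might look like this; the helper names are hypothetical, and only the torchrun flags shown above are used.

```python
import socket
from pathlib import Path


def find_free_port():
    """Ask the OS for a currently unused TCP port on the local machine."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS choose
        return sock.getsockname()[1]


def build_train_command(num_gpus, port, config="runs/vox256/vox256.yaml"):
    """Assemble the single-node torchrun command from this README."""
    return [
        "torchrun",
        "--rdzv-backend=c10d",
        f"--rdzv-endpoint=localhost:{port}",
        "--nnodes", "1",
        "--nproc_per_node", str(num_gpus),
        "train.py", config,
    ]


def checkpoint_dir(config="runs/vox256/vox256.yaml"):
    """Directory of the config file, where checkpoints are saved per this README."""
    return Path(config).parent
```

For example, `build_train_command(2, find_free_port())` yields a command for two GPUs, and `checkpoint_dir()` points to `runs/vox256`.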

Acknowledgement

Our FSRT implementation is based on the PyTorch implementation of Scene Representation Transformer and First Order Motion Model for Image Animation.

BibTeX

@inproceedings{rochow2024fsrt,
  title={{FSRT}: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-pose, and Facial Expression Features},
  author={Rochow, Andre and Schwarz, Max and Behnke, Sven},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={7716--7726},
  year={2024}
}
