Adding adversarial loss and perceptual loss (VGGFace) to deepfakes' auto-encoder architecture.
Date | Update |
---|---|
2018-03-17 | Training: V2 model now provides a 40000-iter training schedule which automatically switches to proper loss functions at predefined iterations. (Cage/Trump dataset results) |
2018-03-13 | Model architecture: V2.1 model now provides 3 base architectures: (i) XGAN, (ii) VAE-GAN, and (iii) a variant of v2 GAN. See "4. Training Phase Configuration" in v2.1 notebook for detail. |
2018-03-03 | Model architecture: Add a new notebook which contains an improved GAN architecture. The architecture is greatly inspired by XGAN and MS-D neural network. |
2018-02-13 | Video conversion: Add a new video processing script using MTCNN for face detection. Faster detection with a configurable threshold value. No need for CUDA-supported dlib. (New notebook: v2_test_video_MTCNN) |
- FaceSwap_GAN_v2_train.ipynb (recommended for training)
  - Script for training the version 2 GAN model.
  - Video conversion functions are also included.
- FaceSwap_GAN_v2_test_video.ipynb
  - Script for generating videos.
  - Uses the face_recognition module for face detection.
- FaceSwap_GAN_v2_test_video_MTCNN.ipynb (recommended for video conversion)
  - Script for generating videos.
  - Uses MTCNN for face detection. Does not require CUDA-supported dlib.
- faceswap_WGAN-GP_keras_github.ipynb
  - This notebook contains a class of GAN model using WGAN-GP.
  - Perceptual loss is discarded for simplicity.
  - The WGAN-GP model gave results similar to the LSGAN model after a comparable number (~18k) of generator updates.

  ```python
  gan = FaceSwapGAN()  # instantiate the class
  gan.train(max_iters=10e4, save_interval=500)  # start training
  ```
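  As an illustration only (not the notebook's exact code), a minimal sketch of the WGAN-GP gradient penalty term, assuming Keras 2 with the TensorFlow backend and a Keras discriminator model `netD`:

  ```python
  import keras.backend as K

  def gradient_penalty(netD, real, fake, batch_size):
      # Sample points on straight lines between real and generated images.
      alpha = K.random_uniform(shape=[batch_size, 1, 1, 1])
      interp = alpha * real + (1.0 - alpha) * fake
      # Penalize deviation of the discriminator's gradient norm from 1 (WGAN-GP).
      grads = K.gradients(netD(interp), [interp])[0]
      grad_norm = K.sqrt(K.sum(K.square(grads), axis=[1, 2, 3]) + K.epsilon())
      return K.mean(K.square(grad_norm - 1.0))
  ```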
- FaceSwap_GAN_v2_sz128_train.ipynb
  - Input and output images have a larger shape of (128, 128, 3).
  - Minor updates on the architectures:
    - Add instance normalization to generators and discriminators.
    - Add an additional regression loss (MAE loss) on the 64x64 branch output.
  - Not compatible with the `_test_video` and `_test_video_MTCNN` notebooks above.
- dlib_video_face_detection.ipynb
  - Detect/crop faces in a video using dlib's CNN model.
  - Pack cropped face images into a zip file.
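  For reference, a minimal sketch of CNN-based face detection and cropping with dlib (the weights file is dlib's standard `mmod_human_face_detector.dat`; the file paths are placeholders, and the actual notebook additionally packs the crops into a zip file):

  ```python
  import cv2
  import dlib

  # dlib's CNN (MMOD) face detector; requires the pretrained weights file.
  detector = dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat")

  img = dlib.load_rgb_image("frame_0001.jpg")   # one extracted video frame
  for i, det in enumerate(detector(img, 1)):    # upsample the image once
      r = det.rect
      crop = img[max(r.top(), 0):r.bottom(), max(r.left(), 0):r.right()]
      # Convert RGB back to BGR before writing with OpenCV.
      cv2.imwrite("face_%04d.jpg" % i, cv2.cvtColor(crop, cv2.COLOR_RGB2BGR))
  ```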
- Training data: Face images should be placed in the `./faceA/` and `./faceB/` folders, one for each target respectively. Face images can be of any size.
- Improved output quality: Adversarial loss improves the reconstruction quality of generated images.
- VGGFace perceptual loss: The perceptual loss makes the direction of the eyeballs more realistic and consistent with the input face (see the sketch below).
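  A minimal sketch (an assumption, not the repo's exact loss) of how a VGGFace-based perceptual loss can be formed with keras-vggface: compare feature maps of the real and generated faces extracted by a frozen VGGFace network. Faces are assumed to be already resized to 224x224 RGB.

  ```python
  from keras_vggface.vggface import VGGFace
  import keras.backend as K

  # Frozen VGGFace feature extractor (convolutional part only).
  vggface = VGGFace(include_top=False, input_shape=(224, 224, 3))
  vggface.trainable = False

  def perceptual_loss(real, fake):
      # L1 distance between VGGFace feature maps of real and generated faces.
      return K.mean(K.abs(vggface(real) - vggface(fake)))
  ```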
- Smoothed bounding box (Smoothed bbox): An exponential moving average of the bounding box position over frames is introduced to eliminate jitter on the swapped face, as sketched below.
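  For illustration, a minimal sketch of the idea (the smoothing factor below is an assumption, not the repo's value):

  ```python
  def smooth_bbox(prev_bbox, new_bbox, alpha=0.65):
      """Exponential moving average of bbox coordinates (x0, y0, x1, y1) across frames."""
      if prev_bbox is None:
          return new_bbox
      # Blend the previous position with the new detection to suppress jitter.
      return tuple(alpha * p + (1.0 - alpha) * n for p, n in zip(prev_bbox, new_bbox))
  ```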
- Unsupervised segmentation mask: The model learns a proper mask that helps in handling occlusion, eliminating artifacts on bbox edges, and producing a natural skin tone. Below are results transforming Hinako Sano (佐野ひなこ) to Emi Takei (武井咲).
  - From left to right: source face, swapped face (before masking), swapped face (after masking).
  - From left to right: source face, swapped face (after masking), mask heatmap.

  Source video: 佐野ひなことすごくどうでもいい話?(遊戯王)
- Optional 128x128 input/output resolution: Increase input and output size from 64x64 to 128x128.
- Face detection/tracking using MTCNN and Kalman filter: More stable detection and smoother tracking (a minimal tracking sketch is shown below).
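  A minimal sketch of Kalman-filter smoothing for one tracked point (e.g., a bbox corner), assuming a constant-velocity model; the noise settings are illustrative assumptions, not the repo's values:

  ```python
  import numpy as np
  import cv2

  # State: [x, y, dx, dy]; measurement: [x, y].
  kf = cv2.KalmanFilter(4, 2)
  kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                   [0, 1, 0, 0]], dtype=np.float32)
  kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                  [0, 1, 0, 1],
                                  [0, 0, 1, 0],
                                  [0, 0, 0, 1]], dtype=np.float32)
  kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-3
  kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1

  def track(x, y):
      """Feed a raw detection; return the smoothed, predicted position."""
      kf.correct(np.array([[x], [y]], dtype=np.float32))
      pred = kf.predict()
      return float(pred[0, 0]), float(pred[1, 0])
  ```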
- Training schedule: The V2 model provides a predefined training schedule. The Trump/Cage results above were generated by a model trained for 21k iterations using the predefined `TOTAL_ITERS = 30000` training schedule.
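  A hypothetical sketch of such an iteration-based schedule (the phase boundaries and weight names below are illustrative assumptions, not the notebook's actual configuration):

  ```python
  TOTAL_ITERS = 30000

  def loss_weights(iteration):
      # Switch loss configurations at predefined fractions of the schedule.
      if iteration < 0.3 * TOTAL_ITERS:    # early: focus on reconstruction
          return dict(w_recon=1.0, w_adv=0.1, w_perceptual=0.0)
      elif iteration < 0.8 * TOTAL_ITERS:  # mid: enable the perceptual term
          return dict(w_recon=1.0, w_adv=0.5, w_perceptual=0.01)
      else:                                # late: emphasize adversarial/perceptual terms
          return dict(w_recon=0.5, w_adv=1.0, w_perceptual=0.1)
  ```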
- V2.1 update: An improved architecture has been introduced in order to stabilize training. The architecture is greatly inspired by XGAN and the MS-D neural network.
  - The V2.1 model provides three base architectures: (i) XGAN, (ii) VAE-GAN, and (iii) a variant of the v2 GAN (default `base_model="GAN"`).
  - More discriminators/losses are added to the GAN. Specifically, they are (minimal sketches of two of these losses follow at the end of this item):
    - GAN loss for non-masked outputs (common): two more discriminators are added to the non-masked outputs.
    - Perceptual adversarial loss (common): a feature-level L1 loss which improves semantic detail.
    - Domain-adversarial loss (XGAN): "It encourages the embeddings learned by the encoder to lie in the same subspace."
    - Semantic consistency loss (XGAN): a cosine-distance loss on embeddings to preserve the semantics of the input.
    - KL loss (VAE-GAN): KL divergence between N(0, 1) and the latent vector.
  - Replacing one `res_block` in the decoder with an MS-D network (default depth = 16) for output refinement was tried, but it is a very inefficient implementation of the MS-D network, so the MS-D network is not included for now.
  - Preview images are saved in the `./previews` folder.
  - (WIP) Random motion blur as data augmentation, reducing the ghost effect in the output video.
  - FCN8s for face segmentation is introduced to improve masking in video conversion (default `use_FCN_mask = True`).
    - To enable this feature, a Keras weights file should be generated through the Jupyter notebook provided in this repo.
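  Minimal sketches (assumptions written with the Keras backend, not the repo's exact implementations) of two of the V2.1 losses listed above:

  ```python
  import keras.backend as K

  def semantic_consistency_loss(emb_input, emb_cycled):
      # XGAN: cosine distance between the embedding of the input face and the
      # embedding of its translated-then-re-encoded counterpart.
      a = K.l2_normalize(emb_input, axis=-1)
      b = K.l2_normalize(emb_cycled, axis=-1)
      return K.mean(1.0 - K.sum(a * b, axis=-1))

  def kl_loss(mean, log_var):
      # VAE-GAN: KL divergence between N(mean, exp(log_var)) and N(0, 1).
      return -0.5 * K.mean(1.0 + log_var - K.square(mean) - K.exp(log_var))
  ```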
- It is likely due to the resolution of the input video being too high; modifying the parameters in step 13 or 14 will solve it.
  - First, increase `video_scaling_offset = 0` to 1 or higher.
  - If it doesn't help, set `manually_downscale = True`.
  - If the above still does not help, disable the CNN model for face detection, as shown below.

    ```python
    def process_video(...):
        ...
        #faces = get_faces_bbox(image, model="cnn")  # Use CNN model
        faces = get_faces_bbox(image, model='hog')   # Use default HOG features
    ```
- The following illustration shows a very high-level, abstract (and not exactly faithful) flowchart of the denoising autoencoder algorithm. The objective functions look roughly like the sketch below.
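  Since the original illustration is not reproduced here, a rough sketch of the overall objective, under the assumption of an L1 reconstruction term on warped inputs plus the adversarial and VGGFace perceptual terms described above (the weights λ are placeholders):

  ```latex
  L_{\mathrm{total}}
    = \underbrace{\mathbb{E}_{x}\big[\lVert G(\mathrm{warp}(x)) - x \rVert_{1}\big]}_{\text{reconstruction}}
    + \lambda_{\mathrm{adv}}\, L_{\mathrm{adv}}(G, D)
    + \lambda_{\mathrm{pl}}\, L_{\mathrm{pl}}\big(G(\mathrm{warp}(x)),\, x\big)
  ```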
- Set `audio=True` in the video-making cell.

  ```python
  output = 'OUTPUT_VIDEO.mp4'
  clip1 = VideoFileClip("INPUT_VIDEO.mp4")
  clip = clip1.fl_image(process_video)
  %time clip.write_videofile(output, audio=True) # Set audio=True
  ```
- The default setting transforms face B to face A.
- To transform face A to face B, modify the following parameters, depending on your current running notebook:
  - Change `path_abgr_A` to `path_abgr_B` in `process_video()` (step 13/14 of v2_train.ipynb and v2_sz128_train.ipynb).
  - Change `whom2whom = "BtoA"` to `whom2whom = "AtoB"` (step 12 of v2_test_video.ipynb).
- keras 2
- TensorFlow 1.3
- Python 3
- OpenCV
- keras-vggface
- moviepy
- dlib (optional)
- face_recognition (optional)
Code borrows from tjwei, eriklindernoren, fchollet, keras-contrib and deepfakes. The generative network is adopted from CycleGAN. Weights and scripts of MTCNN are from FaceNet. Illustrations are from irasutoya.