# vid2vid--paddle

## About this project

This project is a PaddlePaddle implementation of few-shot photorealistic video-to-video translation. Go to this link for more details. It borrows heavily from the original projects Few-shot vid2vid and Imaginaire. This work is made available under the Nvidia Source Code License (1-Way Commercial). To view a copy of this license, visit License. If this work benefits you, please cite:

```
@inproceedings{wang2018fewshotvid2vid,
  author    = {Ting-Chun Wang and Ming-Yu Liu and Andrew Tao and Guilin Liu and Jan Kautz and Bryan Catanzaro},
  title     = {Few-shot Video-to-Video Synthesis},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year      = {2019},
}
```

## Dependencies

This project is developed entirely on AI Studio. Because of the limited package ecosystem on AI Studio, several dependencies had to be implemented first to support this project. Here I implemented the required dependencies myself, including openpose-paddle and densepose-paddle used for pose extraction (see the Dataset section below).

## Dataset

### YouTube Dancing Videos Dataset

The YouTube Dancing Videos Dataset is a large-scale dancing video dataset collected from YouTube. Note that the dataset used in this project is slightly different from the one in the original project, with additional data collected from bilibili for augmentation. In the end, 700 raw videos form the raw video dataset. Then, I used openpose-paddle and densepose-paddle to extract pose annotations from the raw videos. Finally, I obtained 4,240 video sequences and 1,382,329 raw frames with corresponding pose annotations. The dataset is split into 4 subsets for the convenience of training and because of storage constraints. For more details, please go to preprocess and YouTube Dancing Videos Dataset.
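
To give a rough idea of how the split into 4 subsets can be produced, here is a minimal sketch that shuffles the extracted sequence folders and distributes them evenly. It assumes one folder per sequence; the `sequences/` and `subsets/` paths are placeholders, not the exact layout produced by preprocess.

```python
import random
import shutil
from pathlib import Path

# Placeholder layout: one folder per extracted sequence, each holding the raw
# frames together with the openpose/densepose annotations.
src_root = Path("/home/aistudio/data/sequences")
dst_root = Path("/home/aistudio/data/subsets")
num_subsets = 4

sequences = sorted(p for p in src_root.iterdir() if p.is_dir())
random.seed(0)          # fixed seed so the split is reproducible
random.shuffle(sequences)

for idx, seq in enumerate(sequences):
    subset_dir = dst_root / f"set_{idx % num_subsets + 1}"
    subset_dir.mkdir(parents=True, exist_ok=True)
    # Move the whole sequence folder into its subset.
    shutil.move(str(seq), str(subset_dir / seq.name))
```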

### For other datasets

Not implemented here!

## Train

```bash
! cd /home/aistudio/vid2vid/ && python ./train.py --logdir /path/to/log/directory/ \
                                   --max_epoch 20 \
                                   --max_iter_per_epoch 10000 \
                                   --num_epochs_temporal_step 4 \
                                   --train_data_root /path/to/dancing/video/dataset \
                                   --val_data_root /path/to/evaluation/dataset
```

To train on your own dataset, please follow the instructions in preprocess to prepare your dataset first, then run the command above. Be careful of mode collapse when training on a dataset with high inner variance.
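
Before launching a long run, it can help to verify that every frame in the training set has a matching pose annotation. The sketch below is only an illustration and assumes an `images/` and `poses/` subfolder per sequence, which may differ from the exact layout produced by preprocess.

```python
from pathlib import Path

# Placeholder: point this at the directory passed to --train_data_root.
data_root = Path("/path/to/dancing/video/dataset")

missing = []
for seq in sorted(p for p in data_root.iterdir() if p.is_dir()):
    # Assumed layout: <sequence>/images/*.jpg with <sequence>/poses/*.png
    for frame in sorted((seq / "images").glob("*.jpg")):
        pose = seq / "poses" / (frame.stem + ".png")
        if not pose.exists():
            missing.append(pose)

print(f"{len(missing)} frames are missing pose annotations")
```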

## Pretrained Models

Here I trained four models, one for each of the 4 subsets described above. Note that these four models are trained on the YouTube Dancing Videos Dataset; you need to finetune on your own dataset to synthesize your own videos.

I put these pretrained models here. The model trained on set 4 has not been released yet.
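
As a quick sanity check after downloading, a checkpoint can be inspected with standard Paddle APIs before finetuning. This is a minimal sketch that assumes the released checkpoints are ordinary `.pdparams` state dicts; the file name below is a placeholder.

```python
import paddle

# Placeholder path: point this at one of the downloaded checkpoint files.
checkpoint_path = "/home/aistudio/work/logs/checkpoints/set1_generator.pdparams"

# paddle.load restores the saved state dict; set_state_dict on the network
# defined in this repository would then copy the weights in before finetuning.
state_dict = paddle.load(checkpoint_path)
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```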

## Evaluation

```bash
! cd /home/aistudio/vid2vid/ && python ./evaluate.py --logdir /path/to/evaluation/results/output/directory \
                                     --checkpoint_logdir /path/to/checkpoints/directory \
                                     --eval_data_dir /path/to/eval/data/directory
```

For example, to evaluate the model trained on set 1, I can run:

```bash
python ./evaluate.py --logdir /home/aistudio/vid2vid/outputs/evaluation/ \
               --checkpoint_logdir /home/aistudio/work/logs/checkpoints/ \
               --eval_data_dir /home/aistudio/data/data68795/home/aistudio/test_pose/1_pose/images/
```

## Acknowledgement

Thanks to AI Studio for providing the GPU resources used in this project.