Official PyTorch Implementation of the SVCNet Paper
Project | arXiv | IEEE Xplore
SVCNet is an architecture for scribble-based video colorization, which includes two sub-networks: CPNet and SSNet. This repo contains training and evaluation code for the following paper:
SVCNet: Scribble-based Video Colorization Network with Temporal Aggregation
Yuzhi Zhao¹, Lai-Man Po¹, Kangcheng Liu², Xuehui Wang³, Wing-Yin Yu¹, Pengfei Xian¹, Yujia Zhang⁴, Mengyang Liu⁴
¹City University of Hong Kong, ²Nanyang Technological University, ³Shanghai Jiao Tong University, ⁴Tencent Video
IEEE Transactions on Image Processing (TIP), 2023
We tested the code with CUDA 10.0 (higher versions are also compatible). The basic requirements are as follows:
- pytorch==1.2.0
- torchvision==0.4.0
- cupy-cuda100
- opencv-python
- scipy
- scikit-image
If you use conda, the following commands are helpful:
conda env create -f environment.yaml
conda activate svcnet
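As a quick sanity check that the environment matches the requirement list above (this snippet is our own illustration, not part of the repo's scripts):

```python
# Import the core dependencies and report versions.
import torch
import torchvision
import cv2    # provided by opencv-python
import cupy   # provided by cupy-cuda100
import scipy
import skimage

print('torch:', torch.__version__)              # expect 1.2.0
print('torchvision:', torchvision.__version__)  # expect 0.4.0
print('CUDA available:', torch.cuda.is_available())
```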
We provide the pre-trained SVCNet modules (CPNet and SSNet) and other public pre-trained models (PWCNet and VGG-16). By default, we put all these files under a trained_models root.
All the pre-trained model files can be downloaded at this link.
Alternatively, if you only want to run inference, you can download just the files required for it.
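Once downloaded, the weights load in the usual PyTorch way. A minimal sketch (the checkpoint path below is hypothetical; substitute the actual file names from the download link):

```python
# Hypothetical checkpoint path under the trained_models root;
# replace with the real file name from the download link.
import torch

state_dict = torch.load('trained_models/cpnet_checkpoint.pth',
                        map_location='cpu')
# net.load_state_dict(state_dict)  # net: a CPNet instance built from this repo
```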
We use the ImageNet, DAVIS, and Videvo datasets as our training set. Please cite the original papers if you use these datasets. We release zip files containing the images. By default, we put all these files under a data root.
We generate saliency maps as pseudo segmentation labels for images in the ImageNet and Videvo datasets (images in the DAVIS dataset already have segmentation labels). The saliency detection method is Pyramid Feature Attention Network for Saliency Detection. The generated saliency maps are also released.
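If you want to regenerate pseudo labels with your own saliency detector, the conversion amounts to thresholding the saliency map. A minimal sketch (the 0.5 threshold is our assumption, not necessarily the repo's exact setting):

```python
# Turn a saliency map into a binary pseudo segmentation label.
# The 0.5 threshold is an assumed value, not the repo's setting.
import cv2
import numpy as np

saliency = cv2.imread('saliency_map.png', cv2.IMREAD_GRAYSCALE) / 255.0
pseudo_label = (saliency > 0.5).astype(np.uint8) * 255
cv2.imwrite('pseudo_label.png', pseudo_label)
```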
All the ImageNet files can be downloaded at this link. All the DAVIS-Videvo files can be downloaded at this link. Alternatively, you can find each separate file below:
- CPNet: includes scripts and code for training and validating CPNet
- SSNet: includes scripts and code for training SSNet and validating SVCNet
- Evaluation: includes code for evaluation (e.g., Tables II, IV, and V in the paper)
- GCS: includes code for generating validation color scribbles
We include a legacy video segment along with its corresponding color scribble frames in 4 different styles. The input grayscale frames and color scribbles are also included. The code for generating these color scribbles can be found in the GCS sub-folder. Users can easily reproduce the following results by running:
cd SSNet
python test.py
- Creating your own scribbles (see the GCS sub-folder): you first need to provide a color scribble for the first frame; then, the generate_color_scribbles_video.py script propagates it to the following frames based on the optical flow of your own grayscale video (a minimal sketch of this idea is shown after this list).
- Inference with your generated scribbles (see the SSNet sub-folder): please follow the guide in the README file, e.g., run test.py.
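For intuition, flow-based scribble propagation amounts to warping the previous scribble map along the estimated flow. The sketch below uses OpenCV's Farneback flow as a stand-in for the PWCNet flows the repo actually uses, so it is an illustration of the idea rather than the exact generate_color_scribbles_video.py logic:

```python
# Propagate a scribble map from one frame to the next by backward
# warping along dense optical flow. Farneback flow is a stand-in for
# the PWCNet flows used in this repo.
import cv2
import numpy as np

def propagate_scribble(prev_gray, next_gray, prev_scribble):
    # Flow from the next frame back to the previous one, so each
    # next-frame pixel knows where to sample in the previous frame.
    flow = cv2.calcOpticalFlowFarneback(next_gray, prev_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = next_gray.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    # Nearest-neighbor sampling keeps scribble colors unblended.
    return cv2.remap(prev_scribble, map_x, map_y, cv2.INTER_NEAREST)
```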
A few video samples on the validation dataset are illustrated below:
Some code is borrowed from the PyTorch-PFAN, SCGAN, VCGAN, PyTorch-PWC, and DEVC projects. Thanks for their awesome work.
If you find this work helpful, please consider citing:
@article{zhao2023svcnet,
title={SVCNet: Scribble-based Video Colorization Network with Temporal Aggregation},
author={Zhao, Yuzhi and Po, Lai-Man and Liu, Kangcheng and Wang, Xuehui and Yu, Wing-Yin and Xian, Pengfei and Zhang, Yujia and Liu, Mengyang},
journal={IEEE Transactions on Image Processing},
volume={32},
pages={4443--4458},
year={2023}
}