PCA-CBVR (Prototypical Category Approximation Content-Based Video Retrieval) is a method for content-based video retrieval (CBVR).
The PCA-CBVR process consists of two steps. The first step classifies the query video using a prototype technique. The second step performs a fine search over the database videos that belong to the estimated query video category.
PCA-CBVR performs well on untrimmed videos such as those in ActivityNet.
For more details, please refer to the paper.
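The two steps described above can be sketched in a few lines of plain Python. This is only an illustration, not the implementation in this repository: it assumes that a category prototype is the mean of that category's feature vectors and that similarity is measured with Euclidean distance. The function names and the toy database are hypothetical.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equally sized feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_prototypes(database):
    """Map each category to one prototype (assumed here: the mean feature)."""
    return {cat: mean_vector(list(videos.values()))
            for cat, videos in database.items()}

def retrieve(query, database, prototypes, top_k=2):
    # Step 1: estimate the query category as the nearest prototype.
    category = min(prototypes, key=lambda c: euclidean(query, prototypes[c]))
    # Step 2: fine search restricted to the videos of that category.
    ranked = sorted(database[category].items(),
                    key=lambda item: euclidean(query, item[1]))
    return category, [video_id for video_id, _ in ranked[:top_k]]

# Toy database: {category: {video_id: feature_vector}} (made-up values).
database = {
    "swimming": {"swim_01": [0.9, 0.1], "swim_02": [0.8, 0.2]},
    "archery": {"arch_01": [0.1, 0.9], "arch_02": [0.2, 0.7]},
}
prototypes = build_prototypes(database)
category, results = retrieve([0.85, 0.2], database, prototypes)
print(category, results)
```

Because the fine search only visits one category, the number of distance computations in step 2 scales with the size of that category rather than with the whole database.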
python preprocess.py --args(Default) [choices]
- --db_root_path(./DB): the location of the DB
- --metadb_root_path(./MetaDB): the location of the MetaDB
- --model_root_path(./Models): the location of the Models
- --annotation_type(Categorical) [Categorical, Sampled]: the annotation type to build MetaDB
- --extract_target(UCF101) [UCF101, HMDB51, ActivityNet]: the target dataset to build MetaDB
- --model(R3D18) [R3D18, R3D34, R3D50, R2Plus1D18, R2Plus1D34, R2Plus1D50]: the model to use for feature extraction
- --tune_type(WithoutTune) [WithoutTune, Categorical, FewShot]: the fine-tuning type of the weights to use for feature extraction
- --tune_layer(1) [1, 2, 3, 4]: the number of residual blocks to use for feature extraction
- --tune_target(UCF101) [UCF101, HMDB51, ActivityNet]: the target dataset that was used for fine-tuning
- --tune_way(5): the few-shot "way" parameter (number of classes per episode)
- --tune_shot(1): the few-shot "shot" parameter (number of support samples per class)
- --frame_extractor_frame_size(240): adjust frame size when performing frame extraction
- --feature_extractor_frame_size(112): adjust frame size when performing feature extraction
- --sequence_length(16): the number of frames
- --batch_size(128): the batch size for feature extraction
- --num_workers(-1): the number of cores to use for feature extraction
- --gpu_number(0): the index number of the gpu to use for feature extraction
- --only_cpu(store_true): perform all processing on the CPU
☆ The (tune_layer, tune_target, tune_way, tune_shot) options depend on the (tune_type) option
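As a usage illustration, the helper below assembles a `preprocess.py` command line from the options listed above. It is a hypothetical wrapper, not part of this repository. How exactly the tune options depend on `tune_type` is an assumption here: `tune_layer` and `tune_target` are passed only when fine-tuned weights are requested, and `tune_way` and `tune_shot` only for `FewShot`.

```python
def build_preprocess_command(extract_target="UCF101", model="R3D18",
                             tune_type="WithoutTune", tune_layer=1,
                             tune_target="UCF101", tune_way=5, tune_shot=1,
                             only_cpu=False):
    """Return an argv list for preprocess.py (hypothetical helper)."""
    cmd = ["python", "preprocess.py",
           "--extract_target", extract_target,
           "--model", model,
           "--tune_type", tune_type]
    # Assumption: these options only matter when fine-tuned weights are used.
    if tune_type != "WithoutTune":
        cmd += ["--tune_layer", str(tune_layer), "--tune_target", tune_target]
    # Assumption: the few-shot parameters only matter for FewShot weights.
    if tune_type == "FewShot":
        cmd += ["--tune_way", str(tune_way), "--tune_shot", str(tune_shot)]
    if only_cpu:
        cmd.append("--only_cpu")
    return cmd

default_cmd = build_preprocess_command()
fewshot_cmd = build_preprocess_command(extract_target="ActivityNet",
                                       tune_type="FewShot",
                                       tune_target="ActivityNet")
print(" ".join(default_cmd))
print(" ".join(fewshot_cmd))
# To actually run it: import subprocess; subprocess.run(fewshot_cmd, check=True)
```

Options that are not passed fall back to the defaults shown in parentheses in the list above.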
python finetuning.py --args(Default) [choices]
- --db_root_path(./DB): the location of the DB
- --model_root_path(./Models): the location of the Models
- --pretrained(store_true): use a model pretrained on Kinetics-700
- --tune_target(UCF101) [UCF101, HMDB51, ActivityNet]: the target dataset to train a model
- --model(R3D18) [R3D18, R3D34, R3D50, R2Plus1D18, R2Plus1D34, R2Plus1D50]: the model to use for training
- --tune_layer(1) [-1, 1, 2, 3, 4]: the number of residual blocks to use for training; -1 means all blocks are used
- --tune_type(Categorical) [Categorical, FewShot]: the type of learning strategy
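- --train_iter_size(100): the number of training iterations for few-shot learning
- --val_iter_size(200): the number of validation iterations for few-shot learning
- --tune_way(5): the few-shot "way" parameter (number of classes per episode)
- --tune_shot(1): the few-shot "shot" parameter (number of support samples per class)
- --tune_query(10): the few-shot "query" parameter (number of query samples per class)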
- --shortcut(B) [A, B]: resnet shortcut type
- --max_interval(-1): fix the frame sampling interval; -1 means the interval is not fixed
- --random_interval(store_true): use a random interval in the range 0 to max_interval
- --random_start_position(store_true): use a random start position for frame sampling
- --uniform_frame_sample(store_true): use the uniform frame sampling strategy; the random_interval option is ignored when this option is set
- --random_pad_sample(store_true): use the random pad sampling strategy; the default is to repeat the first
- --frame_size(112): the frame size used for training
- --sequence_length(16): the number of frames
- --batch_size(64): the batch size for training
- --num_epochs(100): the number of epochs for training
- --learning_rate(1e-3): the parameter for SGD
- --scheduler_step_size(10): the parameter for StepLR
- --scheduler_gamma(0.9): the parameter for StepLR
- --momentum(0.9): the parameter for SGD
- --weight_decay(1e-3): the parameter for SGD
- --gpu_number(0): the index number of the gpu to use for training
- --num_workers(4): the number of cores to use for training
- --cudnn_benchmark(store_true): activate cuDNN benchmark option, see BACKENDS
- --only_cpu(store_true): perform all processing on the CPU
☆ The (tune_way, tune_shot) options depend on the (tune_type) option
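To make the few-shot parameters more concrete, here is a minimal sketch of how one training episode could be sampled. It assumes that `tune_way`, `tune_shot`, and `tune_query` follow the usual N-way K-shot convention: N classes per episode, K support videos per class, and a separate set of query videos per class. The function and the toy data are hypothetical and are not taken from `finetuning.py`.

```python
import random

def sample_episode(videos_by_class, way=5, shot=1, query=10, rng=None):
    """Sample one few-shot episode as (support, queries) dicts keyed by class."""
    rng = rng or random.Random()
    # Pick `way` distinct classes for this episode.
    classes = rng.sample(sorted(videos_by_class), way)
    support, queries = {}, {}
    for cls in classes:
        # Draw shot + query distinct videos so the two sets never overlap.
        picked = rng.sample(videos_by_class[cls], shot + query)
        support[cls] = picked[:shot]
        queries[cls] = picked[shot:]
    return support, queries

# Toy data: 8 classes with 20 video identifiers each (made-up names).
toy = {f"class_{c}": [f"class_{c}/video_{i}" for i in range(20)]
       for c in range(8)}
support, queries = sample_episode(toy, way=5, shot=1, query=10,
                                  rng=random.Random(0))
print(len(support), "classes in this episode")
```

Under this convention, each class needs at least `tune_shot + tune_query` videos, which is 11 with the defaults listed above.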
If you use this code in your work, please cite our work:
@ARTICLE{PCA-CVBR2022,
author={Hyeok Yoon and Ji-Hyeong Han},
journal={IEEE Access},
title={Content-Based Video Retrieval With Prototypes of Deep Features},
year={2022},
volume={10},
pages={30730-30742},
doi={10.1109/ACCESS.2022.3160214},
}