MMVideoTextRetrieval

MMVideoTextRetrieval is an open-source video-text retrieval toolbox based on PyTorch.

Introduction

This repository provides implementations of several video-text retrieval methods.

Major Features

Modular design

We decompose the video-text retrieval framework into separate components that can be easily used in any combination.
Support for various datasets and features

The toolbox supports multiple datasets, such as MSR-VTT, ActivityNet, and LSMDC. Various pre-extracted features are also provided.
Support for multiple video text retrieval frameworks

MMVideoTextRetrieval implements popular frameworks for video-text retrieval, such as MMT. More frameworks will be added later.
Visual demo

We provide a demo that visualizes the results of video-text retrieval models.

Demo

We provide a way to run text-to-video retrieval in real-world applications. Before retrieval, the multi-modal features of the videos must be extracted and stored. The query text is defined in the "main_train" function in demo.py, and the "--sentence" option must be passed to activate the retrieval process. The output is the names of the video feature files of the 10 most similar videos.
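The retrieval step described above can be illustrated with a minimal, dependency-free sketch. It is not the code in demo.py: it assumes that the query sentence and every stored video have each already been reduced to a single embedding vector, and it ranks videos by plain cosine similarity, whereas the models in this toolbox may combine several modality-specific similarities. The function names, file names, and vectors below are all hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve_top_k(
    text_embedding: List[float],
    video_embeddings: Dict[str, List[float]],
    k: int = 10,
) -> List[Tuple[str, float]]:
    """Return the k feature-file names most similar to the query text."""
    scored = [
        (name, cosine_similarity(text_embedding, emb))
        for name, emb in video_embeddings.items()
    ]
    # Highest similarity first; ties are broken by file name for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


# Hypothetical, hand-made embeddings standing in for stored video features.
stored_videos = {
    "video_a.npy": [1.0, 0.0, 0.0],
    "video_b.npy": [0.0, 1.0, 0.0],
    "video_c.npy": [1.0, 1.0, 0.0],
}
query = [1.0, 0.0, 0.0]  # hypothetical embedding of the query sentence

top = retrieve_top_k(query, stored_videos, k=2)
print([name for name, _ in top])  # → ['video_a.npy', 'video_c.npy']
```

With k left at its default of 10, the function returns the same kind of top-10 list of feature-file names that the demo reports.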

Benchmark

T2V = text-to-video retrieval; V2T = video-to-text retrieval. R@K is recall at rank K, in percent.

Model	Dataset	Video Feature	Text Feature	Pretrained	T2V R@1	T2V R@5	T2V R@10	V2T R@1	V2T R@5	V2T R@10
MMT	MSRVTT-1kA	S3D	Bert	no	24.6	54	67.1	24.4	56	67.8
MMT	ActivityNet	S3D	Bert	no	22.7	54.2	93.2	22.9	54.8	93.1
MMT	LSMDC	S3D	Bert	no	13.2	29.2	38.8	12.1	29.3	37.9
MMT	MSRVTT-1kA&B	S3D	Bert	HowTo100M	26.6	57.1	69.6	27	57.5	69.7
MMT	ActivityNet	S3D	Bert	HowTo100M	28.7	61.4	94.5	28.9	61.1	94.3
MMT	LSMDC	S3D	Bert	HowTo100M	12.9	29.9	40.1	12.3	28.6	38.9
HGR	MSRVTT-Full	Resnet152	Word2Vec	no	9.2	26.2	36.5	15	36.7	48.8

(All results are taken from the original papers and will be replaced by the results of the pre-trained models later.)
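For readers unfamiliar with the R@K columns, the following is a minimal sketch of how recall at rank K is commonly computed from a query-by-candidate similarity matrix. It assumes exactly one correct candidate per query, located on the diagonal (query i matches candidate i); datasets with several captions per video need a more careful evaluation. The function name and the toy matrix are illustrative and are not taken from this repository.

```python
from typing import Dict, List, Sequence


def recall_at_k(
    similarity: Sequence[Sequence[float]],
    ks: Sequence[int] = (1, 5, 10),
) -> Dict[int, float]:
    """Compute R@K in percent, assuming query i matches candidate i."""
    if len(similarity) == 0:
        raise ValueError("similarity matrix must contain at least one query")
    ranks: List[int] = []
    for i, row in enumerate(similarity):
        target = row[i]
        # Zero-based rank: number of candidates scored strictly higher
        # than the correct one.
        ranks.append(sum(1 for score in row if score > target))
    n = len(ranks)
    return {k: 100.0 * sum(1 for r in ranks if r < k) / n for k in ks}


# Toy 3x3 matrix: rows are text queries, columns are videos.
toy_similarity = [
    [0.9, 0.1, 0.2],  # correct video ranked first
    [0.8, 0.3, 0.1],  # correct video ranked second
    [0.2, 0.6, 0.5],  # correct video ranked second
]
toy_recall = recall_at_k(toy_similarity, ks=(1, 2))
print(toy_recall)  # R@1 is one query out of three, R@2 is all three
```

Rows as text queries give the text-to-video numbers; transposing the matrix so that rows are videos gives the video-to-text numbers.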

Model Zoo

Supported methods for video-text retrieval:

MMT (ECCV'2020)
MMT-modified (ICMEW'2021)
HGR (CVPR'2020)

Dataset

Supported datasets:

MSR-VTT
ActivityNet Captions
- raw dataset
- multi-modal features
LSMDC
- raw dataset
- multi-modal features
TGIF
- raw dataset
- Resnet152 video features
VATEX
- raw dataset
- I3D video features

Get started

Requirements

Python 3.7

PyTorch 1.4.0+
Transformers 3.1.0
Numpy 1.18.1

pip install -r requirements.txt

Training

Training + evaluation:

python -m demo --config configs/${model_name}/${dataset}_${split}_trainval.json

Evaluation from checkpoint:

python -m demo --config configs/${model_name}/${dataset}_${split}_trainval.json --only_eval --load_checkpoint ${checkpoint_path}

Training from a pretrained model:

python -m demo --config configs/${model_name}/prtrn_${dataset}_${split}_trainval.json --load_checkpoint ${checkpoint_path}

Retrieving videos with a specific sentence:

python -m demo --config configs/${model_name}/${dataset}_${split}_trainval.json --only_eval --load_checkpoint ${checkpoint_path} --sentence

Training with the modified version of MMT:

python -m demo --config configs/${model_name}/prtrn_${dataset}_${split}_trainval.json --modified_model
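The commands above differ only in the config file name and in a few flags, so they can be assembled programmatically. The sketch below is a hypothetical helper, not part of this repository. It assumes the placeholders expand to configs/<model_name>/[prtrn_]<dataset>_<split>_trainval.json and uses only the flags shown above. The example values ("mmt", "MSRVTT", "jsfusion", the checkpoint path) are made up for illustration and may not match the actual directory layout.

```python
import shlex
from typing import List, Optional


def build_demo_command(
    model_name: str,
    dataset: str,
    split: str,
    pretrained: bool = False,
    only_eval: bool = False,
    checkpoint_path: Optional[str] = None,
    sentence: bool = False,
    modified_model: bool = False,
) -> List[str]:
    """Assemble the argument list for one of the demo commands."""
    # Assumed naming scheme: an optional "prtrn_" prefix selects the
    # config used when starting from a pretrained model.
    prefix = "prtrn_" if pretrained else ""
    config = f"configs/{model_name}/{prefix}{dataset}_{split}_trainval.json"
    cmd = ["python", "-m", "demo", "--config", config]
    if only_eval:
        cmd.append("--only_eval")
    if checkpoint_path:
        cmd += ["--load_checkpoint", checkpoint_path]
    if sentence:
        cmd.append("--sentence")
    if modified_model:
        cmd.append("--modified_model")
    return cmd


# Hypothetical values, for illustration only.
eval_cmd = build_demo_command(
    "mmt", "MSRVTT", "jsfusion",
    only_eval=True,
    checkpoint_path="checkpoints/model.pth",
)
# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(part) for part in eval_cmd))
# → python -m demo --config configs/mmt/MSRVTT_jsfusion_trainval.json --only_eval --load_checkpoint checkpoints/model.pth
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues when a path contains spaces.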
