Model Zoo

Action Recognition

For action recognition, unless otherwise specified, models are trained on Kinetics-400. The version of Kinetics-400 we used contains 240436 training videos and 19796 testing videos. For TSN, we also train it on UCF101, initialized with ImageNet-pretrained weights. We also provide transfer learning results on UCF101 and HMDB51 for some algorithms. Models marked with * are converted from other repos (including VMZ and kinetics_i3d); the others were trained by ourselves. If you cannot reproduce our testing results because of dataset misalignment, please submit a request at get validation data.

TSN

Kinetics

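| Modality | Pretrained | Backbone | Input | Top-1 | Top-5 | Download |
|----------|------------|----------|-------|-------|-------|----------|
| RGB | ImageNet | ResNet50 | 3seg | 70.6 | 89.4 | model |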
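The Top-1 and Top-5 columns in the tables of this page report top-k classification accuracy. As a minimal sketch of how such a number is usually computed (the function name `topk_accuracy` and the toy scores are illustrative, not part of this repository):

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)


# Toy example: 3 videos, 4 classes (made-up scores, not real model output).
scores = [
    [0.10, 0.60, 0.20, 0.10],  # best class is 1
    [0.50, 0.10, 0.30, 0.10],  # best class is 0, runner-up is 2
    [0.25, 0.15, 0.10, 0.50],  # best class is 3, runner-up is 0
]
labels = [1, 2, 1]

top1 = topk_accuracy(scores, labels, k=1)  # only the first video is correct
top2 = topk_accuracy(scores, labels, k=2)  # the first two videos are correct
print(f"top-1: {top1:.3f}, top-2: {top2:.3f}")
```

The reported values are percentages, so a table entry of 70.6 corresponds to a fraction of 0.706 from a function like this one.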

UCF101

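| Modality | Pretrained | Backbone | Input | Top-1 | Download |
|----------|------------|----------|-------|-------|----------|
| RGB | ImageNet | BNInception | 3seg | 86.4 | model |
| TV-L1 | ImageNet | BNInception | 3seg | 87.7 | model |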

I3D

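| Modality | Pretrained | Backbone | Input | Top-1 | Top-5 | Download |
|----------|------------|----------|-------|-------|-------|----------|
| RGB | ImageNet | Inception-V1 | 64x1 | 71.1 | 89.3 | model* |
| RGB | ImageNet | ResNet50 | 32x2 | 72.9 | 90.8 | model |
| Flow | ImageNet | Inception-V1 | 64x1 | 63.4 | 84.9 | model* |
| Two-Stream | ImageNet | Inception-V1 | 64x1 | 74.2 | 91.3 | / |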
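The Two-Stream row has no download link because it combines the RGB and Flow models rather than being a separate checkpoint. This page does not state how the two streams are fused; the sketch below assumes the common late-fusion scheme of averaging per-class probabilities, and the names `fuse_two_stream` and `rgb_weight` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_stream(rgb_logits, flow_logits, rgb_weight=0.5):
    """Weighted average of the class probabilities of the two streams.

    Assumption: equal weighting by default; the weights actually used for
    the reported results are not given on this page.
    """
    rgb = softmax(rgb_logits)
    flow = softmax(flow_logits)
    return [rgb_weight * r + (1.0 - rgb_weight) * f for r, f in zip(rgb, flow)]


# Made-up logits for one video and three classes.
rgb_logits = [2.0, 1.0, 0.1]   # the RGB stream prefers class 0
flow_logits = [0.5, 2.5, 0.2]  # the flow stream prefers class 1

fused = fuse_two_stream(rgb_logits, flow_logits)
predicted = max(range(len(fused)), key=fused.__getitem__)
print(predicted, [round(p, 3) for p in fused])
```

Averaging probabilities rather than raw logits keeps the two streams on the same scale, which matters when the models were trained separately.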

SlowOnly

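| Modality | Pretrained | Backbone | Input | Top-1 | Top-5 | Download |
|----------|------------|----------|-------|-------|-------|----------|
| RGB | None | ResNet50 | 4x16 | 72.9 | 90.9 | model |
| RGB | ImageNet | ResNet50 | 4x16 | 73.8 | 90.9 | model |
| RGB | None | ResNet50 | 8x8 | 74.8 | 91.9 | model |
| RGB | ImageNet | ResNet50 | 8x8 | 75.7 | 92.2 | model |
| RGB | None | ResNet101 | 8x8 | 76.5 | 92.7 | model |
| RGB | ImageNet | ResNet101 | 8x8 | 76.8 | 92.8 | model |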

SlowFast

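| Modality | Pretrained | Backbone | Input | Top-1 | Top-5 | Download |
|----------|------------|----------|-------|-------|-------|----------|
| RGB | None | ResNet50 | 4x16 | 75.4 | 92.1 | model |
| RGB | ImageNet | ResNet50 | 4x16 | 75.9 | 92.3 | model |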
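The Input column is not defined on this page. The sketch below assumes the convention that is common in the video recognition literature: `TxS` means a clip of T frames sampled every S frames (so 4x16 is 4 frames with a stride of 16), and `3seg` means one frame from each of 3 equal segments of the video. Both function names are hypothetical, and wrapping indices around short videos is an illustrative choice rather than the repository's documented behavior.

```python
def clip_frame_indices(num_frames, clip_len, frame_interval, start=0):
    """Indices of `clip_len` frames spaced `frame_interval` apart.

    Indices wrap around the video length so short videos still yield a
    full clip (an assumption made for this sketch).
    """
    return [(start + i * frame_interval) % num_frames for i in range(clip_len)]


def segment_frame_indices(num_frames, num_segments):
    """Centre frame of each of `num_segments` equal-length segments."""
    seg_len = num_frames / num_segments
    return [int(seg_len * (i + 0.5)) for i in range(num_segments)]


# A 300-frame video under the assumed reading of "4x16", "8x8" and "3seg".
print(clip_frame_indices(300, clip_len=4, frame_interval=16))  # [0, 16, 32, 48]
print(clip_frame_indices(300, clip_len=8, frame_interval=8))   # [0, 8, 16, 24, 32, 40, 48, 56]
print(segment_frame_indices(300, num_segments=3))              # [50, 150, 250]
```

Under this reading, 4x16 and 8x8 both span roughly 64 frames of video; 8x8 samples that span more densely.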

R(2+1)D

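| Modality | Pretrained | Backbone | Input | Top-1 | Top-5 | Download |
|----------|------------|----------|-------|-------|-------|----------|
| RGB | None | ResNet34 | 8x8 | 63.7 | 85.9 | model |
| RGB | IG-65M | ResNet34 | 8x8 | 74.4 | 91.7 | model |
| RGB | None | ResNet34 | 32x2 | 71.8 | 90.4 | model |
| RGB | IG-65M | ResNet34 | 32x2 | 80.3 | 94.7 | model |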

CSN

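| Modality | Pretrained | Backbone | Input | Top-1 | Top-5 | Download |
|----------|------------|----------|-------|-------|-------|----------|
| RGB | IG-65M | irCSN-152 | 32x2 | 82.6 | 95.7 | model* |
| RGB | IG-65M | ipCSN-152 | 32x2 | 82.7 | 95.6 | model* |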

Transfer Learning

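| Model | Modality | Pretrained | Backbone | Input | UCF101 | HMDB51 | Download (split1) |
|-------|----------|------------|----------|-------|--------|--------|-------------------|
| I3D | RGB | Kinetics | I3D | 64x1 | 94.8 | 72.6 | UCF101 / HMDB51 |
| I3D | Flow | Kinetics | I3D | 64x1 | 96.6 | 79.2 | UCF101 / HMDB51 |
| I3D | TwoStream | Kinetics | I3D | 64x1 | 97.8 | 80.8 | / |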

Action Detection

For action detection, we release models trained on THUMOS14.

SSN

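| Modality | Pretrained | Backbone | mAP@0.10 | mAP@0.20 | mAP@0.30 | mAP@0.40 | mAP@0.50 | Download |
|----------|------------|----------|----------|----------|----------|----------|----------|----------|
| RGB | ImageNet | BNInception | 43.09% | 37.95% | 32.56% | 25.71% | 18.33% | model |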
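The mAP@0.10 to mAP@0.50 columns report mean average precision at increasing temporal IoU thresholds: a predicted segment counts as a correct detection only if its overlap with a ground-truth segment reaches the threshold. A minimal sketch of that matching criterion follows; the function names and segment values are illustrative, and the full mAP computation (ranking by score, one match per ground truth, averaging over classes) is omitted.

```python
def temporal_iou(seg_a, seg_b):
    """Intersection over union of two (start, end) intervals in seconds."""
    inter = max(0.0, min(seg_a[1], seg_b[1]) - max(seg_a[0], seg_b[0]))
    union = (seg_a[1] - seg_a[0]) + (seg_b[1] - seg_b[0]) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, ground_truths, threshold):
    """True if `pred` overlaps any ground-truth segment with IoU >= threshold."""
    return any(temporal_iou(pred, gt) >= threshold for gt in ground_truths)


# Made-up prediction and annotation for one action instance.
pred = (10.0, 20.0)
ground_truths = [(14.0, 26.0)]

iou = temporal_iou(pred, ground_truths[0])  # 6 s overlap / 16 s union = 0.375
thresholds = [0.10, 0.20, 0.30, 0.40, 0.50]
matches = [is_match(pred, ground_truths, t) for t in thresholds]
print(iou, matches)
```

The same prediction is accepted at the three loosest thresholds and rejected at 0.40 and 0.50, which is why mAP falls as the threshold rises in the table above.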

Spatial Temporal Action Detection

For spatial temporal action detection, we release models trained on AVA.

| Modality | Model | Pretrained | Backbone | mAP@0.5 | Download |
|----------|-------|------------|----------|---------|----------|
| RGB | Fast-RCNN | Kinetics | NL-I3D R50 | 21.2 | model |