We provide backbone models pre-trained on the Kinetics dataset, which are used for further fine-tuning on the AVA dataset. The reported accuracies are obtained with 30-view testing.

backbone | pre-train | frame length | sample rate | top-1 (%) | top-5 (%) | model |
---|---|---|---|---|---|---|
SlowFast-R50 | Kinetics-700 | 4 | 16 | 66.34 | 86.66 | [link] |
SlowFast-R101 | Kinetics-700 | 8 | 8 | 69.32 | 88.84 | [link] |
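
As a rough illustration of what multi-view testing involves, the sketch below averages per-class scores over all views of one video and then checks top-k correctness. It is an assumption about the general technique, not this repository's evaluation code: the text above does not say how the 30 views are sampled or combined (a common choice in SlowFast-style evaluation is 10 temporal clips times 3 spatial crops), and the names `average_views` and `in_top_k` are hypothetical.

```python
from typing import List, Sequence


def average_views(view_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class scores over all views of a single video."""
    if not view_scores:
        raise ValueError("need at least one view")
    num_classes = len(view_scores[0])
    if any(len(v) != num_classes for v in view_scores):
        raise ValueError("all views must score the same number of classes")
    n = len(view_scores)
    return [sum(v[c] for v in view_scores) / n for c in range(num_classes)]


def in_top_k(scores: Sequence[float], label: int, k: int) -> bool:
    """Return True if `label` is among the k highest-scoring classes."""
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return label in ranked[:k]


# Toy data (made up): 30 views of one video, 4 classes.
# 20 views favour class 1, 10 views favour class 0.
views = [[0.1, 0.6, 0.2, 0.1]] * 20 + [[0.5, 0.2, 0.2, 0.1]] * 10
video_scores = average_views(views)

print(in_top_k(video_scores, label=1, k=1))
print(in_top_k(video_scores, label=0, k=2))
```

Averaging before ranking means a few views that disagree do not flip the video-level prediction, which is the usual motivation for reporting multi-view accuracy.
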

config | backbone | IA structure | mAP | mAP in paper | model |
---|---|---|---|---|---|
resnet50_4x16f_baseline | SlowFast-R50-4x16 | w/o | 26.7 | 26.5 | [link] |
resnet50_4x16f_parallel | SlowFast-R50-4x16 | Parallel | 29.0 | 28.9 | [link] |
resnet50_4x16f_serial | SlowFast-R50-4x16 | Serial | 29.8 | 29.6 | [link] |
resnet50_4x16f_denseserial | SlowFast-R50-4x16 | Dense Serial | 30.0 | 29.8 | [link] |
resnet101_8x8f_baseline | SlowFast-R101-8x8 | w/o | 29.3 | 29.3 | [link] |
resnet101_8x8f_denseserial | SlowFast-R101-8x8 | Dense Serial | 32.4 | 32.3 | [link] |
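
To make the comparison in the table easier to read, the sketch below parses each config name and computes the mAP gain of every IA structure over the baseline with the same backbone. The mAP values are copied from the reproduced mAP column above; reading `4x16f` as frame length 4 with sample rate 16 is an assumption based on the backbone table, and `parse_config` and `gains_over_baseline` are hypothetical helper names, not part of the repository.

```python
import re

# Reproduced mAP values, copied from the table above.
RESULTS = {
    "resnet50_4x16f_baseline": 26.7,
    "resnet50_4x16f_parallel": 29.0,
    "resnet50_4x16f_serial": 29.8,
    "resnet50_4x16f_denseserial": 30.0,
    "resnet101_8x8f_baseline": 29.3,
    "resnet101_8x8f_denseserial": 32.4,
}

# Assumed naming scheme: resnet<depth>_<frames>x<rate>f_<structure>
CONFIG_RE = re.compile(r"resnet(\d+)_(\d+)x(\d+)f_(\w+)")


def parse_config(name):
    """Split a config name into depth, frame length, sample rate, structure."""
    match = CONFIG_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognised config name: {name}")
    depth, frames, rate, structure = match.groups()
    return {
        "depth": int(depth),
        "frames": int(frames),
        "rate": int(rate),
        "structure": structure,
    }


def gains_over_baseline(results):
    """Return the mAP gain of each non-baseline config over its baseline."""
    gains = {}
    for name, m_ap in results.items():
        info = parse_config(name)
        if info["structure"] == "baseline":
            continue
        baseline = f"resnet{info['depth']}_{info['frames']}x{info['rate']}f_baseline"
        # Round to one decimal to match the table's precision.
        gains[name] = round(m_ap - results[baseline], 1)
    return gains


for config, gain in gains_over_baseline(RESULTS).items():
    print(f"{config}: +{gain} mAP")
```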