Comparisons for Recognizers #773

Open · 1 of 4 tasks
irvingzhang0512 opened this issue Mar 31, 2021 · 5 comments

@irvingzhang0512 (Contributor)

Description

MMAction2 provides a large number of recognizers. To choose the right model for a given application, it would help to compare all models in one table, but I am not sure of the best way to do this.

I'm open to all suggestions.

Inference time statistics

  • Inference time is my priority, so here is a table for it; a minimal timing sketch follows the notes below.
  • The related code can be found here.
| model name | Tesla V100-PCIE (32f / 16f / 8f) | GTX 1080Ti (32f / 16f / 8f) | Jetson AGX Xavier (32f / 16f / 8f) |
| --- | --- | --- | --- |
| TSN_r50 | 31/17/10 | 52/26/14 | 258/134/80 |
| TSM_r50 | 34/19/13 | 59/30/16 | 278/145/86 |
| TSM_MobileNetV2 | 10/10/10 | 23/12/7 | 81/41/23 |
| TIN_r50 | 72/30/30 | 141/52/24 | 561/218/104 |
| TANet | 37/21/19 | 64/33/19 | 429/165/100 |
| I3D_r50 | 27/23/21 | 21/14/11 | 128/68/37 |
| R2Plus1D_r34 | 61/41/32 | 77/41/27 | 539/278/146 |
| CSN_r152 | 172/169/169 | 115/92/88 | 584/303/163 |
| SlowFast_r50 | 34/28/28 | 28/18/14 | 150/81/45 |
| SlowOnly_r50 | 58/38/28 | 79/41/23 | 576/301/160 |
| X3D | 95/93/92 | 84/52/49 | 415/212/112 |

Notes:

  • The unit of inference time is milliseconds (ms).
  • 32f, 16f, 8f denote the number of input frames.
    • The default input shape for 2D recognizers is (1, num_frames, 3, 224, 224).
    • The default input shape for 3D recognizers is (1, 1, 3, num_frames, 224, 224).
  • TPN models and C3D models are not included yet.
    • TPN models are not valid for 32 frames.
    • C3D models only support the input shape (1, 1, 3, 16, 112, 112).
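For reference, a minimal sketch of how per-forward-pass latency can be measured, assuming a generic `torch.nn.Module` recognizer on CUDA; the `recognizer` name is a placeholder, and the real MMAction2 forward interface may differ (e.g. a dedicated dummy-forward method):

```python
import time

import torch


def measure_latency(model, input_shape, num_warmup=10, num_iters=100):
    """Average forward-pass latency in milliseconds for a random dummy input."""
    model = model.cuda().eval()
    dummy = torch.randn(*input_shape, device='cuda')
    with torch.no_grad():
        # Warm-up so cuDNN autotuning and lazy CUDA init do not skew the timing.
        for _ in range(num_warmup):
            model(dummy)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(num_iters):
            model(dummy)
        torch.cuda.synchronize()
        elapsed = time.perf_counter() - start
    return elapsed / num_iters * 1000


# Example: a 2D recognizer fed 8 frames, i.e. input shape (1, 8, 3, 224, 224).
# print(measure_latency(recognizer, (1, 8, 3, 224, 224)))
```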

TODO

  • Inference time for PyTorch models with the default config.
  • Inference time for PyTorch/ONNX/TensorRT models with various configs.
    • PyTorch models support fp16, fuse_conv_bn, cudnn, etc. (a sketch of these toggles follows this list).
    • TensorRT models support fp16/int8.
  • Detailed information for each model, such as FLOPs, GPU memory, training/test results, etc.
  • Inference time for input preprocessing.
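A minimal sketch of the PyTorch-side speed toggles named above. It assumes `fuse_conv_bn` is importable from `mmcv.cnn` (its usual location in recent mmcv); the `recognizer` and input names are placeholders:

```python
import torch
from mmcv.cnn import fuse_conv_bn  # assumed location of mmcv's Conv+BN fusion helper


def prepare_for_benchmark(model, use_fp16=True, fuse=True):
    """Apply optional inference-speed toggles before timing."""
    model = model.cuda().eval()
    if fuse:
        model = fuse_conv_bn(model)         # fold BatchNorm into the preceding convs
    if use_fp16:
        model = model.half()                # weights and activations in fp16
    torch.backends.cudnn.benchmark = True   # let cuDNN pick the fastest kernels
    return model


# dummy = torch.randn(1, 8, 3, 224, 224).cuda().half()   # fp16 inputs to match the model
# with torch.no_grad():
#     out = prepare_for_benchmark(recognizer)(dummy)     # `recognizer` is a placeholder
```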
@innerlee (Contributor)

Thanks, this is great!

  • For TPN and C3D, is there a way to set up a comparison that is as fair as possible? For example, if a model supports only 8 or 16 frames, you could forward 32 frames in one batch with batch size 4 or 2, respectively (see the sketch after this list).
  • Generally there is a speed/accuracy trade-off. Reporting accuracies on a common test set (e.g. 4000 videos picked from the Kinetics-400 test set) would help evaluate any performance degradation for the different precisions.
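A minimal sketch of that batching trick. For simplicity it assumes a (1, C, T, H, W) video tensor rather than the full clip-based layout listed in the notes above; all names are placeholders:

```python
import torch


def split_into_clips(video, clip_len):
    """Split (1, C, T, H, W) into (T // clip_len, C, clip_len, H, W), so a
    32-frame budget can be fed to a model that only accepts 8- or 16-frame
    clips by stacking the clips along the batch dimension."""
    _, c, t, h, w = video.shape
    assert t % clip_len == 0, 'frame budget must be divisible by the clip length'
    n = t // clip_len
    clips = video.reshape(1, c, n, clip_len, h, w)   # split the time axis into clips
    clips = clips.permute(2, 0, 1, 3, 4, 5)          # move the clip index to the front
    return clips.reshape(n, c, clip_len, h, w)       # one clip per batch entry


# 32 frames forwarded as a batch of four 8-frame clips:
# batch = split_into_clips(torch.randn(1, 3, 32, 224, 224), clip_len=8)  # -> (4, 3, 8, 224, 224)
```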

@irvingzhang0512 (Contributor, Author)

I went through the C3D code. It turns out that we cannot modify the C3D config to support other input shapes.
I haven't studied the TPN code yet; I may take a look in April.

Maybe a table like this:

| model type | model name | sampling strategy | v100/1080ti/agx latency (ms) | Kinetics-400 accuracy | SthV2 accuracy | comments |
| --- | --- | --- | --- | --- | --- | --- |
| PyTorch | TSM-R50 | 1x1x8 | 13/16/86 | 70.24 / 89.56 | 57.86 / 61.12 | / |

@dreamerlin (Collaborator)

Note: we will support auto_fp16 using torch.cuda.amp in the future (open-mmlab/mmcv#791); a minimal autocast sketch is given below.
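For reference, a minimal sketch of mixed-precision inference with `torch.cuda.amp`, independent of the planned auto_fp16 decorator; the `recognizer` and input names are placeholders:

```python
import torch


@torch.no_grad()
def amp_forward(model, dummy):
    """Run one forward pass under automatic mixed precision."""
    model = model.cuda().eval()
    with torch.cuda.amp.autocast():
        # Ops that benefit from fp16 (convs, matmuls) run in half precision,
        # numerically sensitive ops stay in fp32.
        return model(dummy.cuda())


# out = amp_forward(recognizer, torch.randn(1, 8, 3, 224, 224))  # `recognizer` is a placeholder
```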

@innerlee (Contributor)

Not really. Most of the info in the table above is already present in the modelzoo, or can be added to its tables (i.e. a v100/1080ti/agx latency column).

I think the most valuable part is the speed/accuracy benchmark for different precisions.

@irvingzhang0512 (Contributor, Author)

Something like the speed/accuracy plot in GluonCV?
[Image: bokeh_plot — speed vs. accuracy scatter of models, as in the GluonCV model zoo]
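A minimal matplotlib sketch of such a speed/accuracy plot; the model names and data points are illustrative placeholders, not measured results:

```python
import matplotlib.pyplot as plt

# Placeholder (latency_ms, top1_acc) points -- not real benchmark numbers.
models = {
    'ModelA': (10, 70.0),
    'ModelB': (20, 74.0),
    'ModelC': (40, 76.5),
}

fig, ax = plt.subplots()
for name, (latency, acc) in models.items():
    ax.scatter(latency, acc)
    ax.annotate(name, (latency, acc), textcoords='offset points', xytext=(5, 5))

ax.set_xlabel('Inference latency (ms)')
ax.set_ylabel('Top-1 accuracy (%)')
ax.set_title('Speed / accuracy trade-off')
fig.savefig('speed_accuracy.png')
```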
