ImageNet Model zoo overview

Catalogue

  • 1. Model library overview diagram
  • 2. SSLD pretrained models
  • 3. PP-LCNet series
  • 4. ResNet series
  • 5. Mobile series
  • 6. SEResNeXt and Res2Net series
  • 7. DPN and DenseNet series
  • 8. HRNet series
  • 9. Inception series
  • 10. EfficientNet and ResNeXt101_wsl series
  • 11. ResNeSt and RegNet series
  • 12. ViT and DeiT series
  • 13. RepVGG series
  • 14. MixNet series
  • 15. ReXNet series
  • 16. SwinTransformer series
  • 17. LeViT series
  • 18. Twins series
  • 19. HarDNet series
  • 20. DLA series
  • 21. RedNet series
  • 22. TNT series
  • 23. CSWinTransformer series
  • 24. PVTV2 series
  • 25. MobileViT series
  • 26. Other models
  • Reference

1. Model library overview diagram

Based on the ImageNet-1k classification dataset, the 37 classification network structures supported by PaddleClas and the corresponding 217 image classification pretrained models are shown below. Training tricks, a brief introduction to each series of network structures, and performance evaluations are presented in the corresponding chapters. The evaluation environment is as follows.

  • The Arm CPU evaluation environment is based on a Snapdragon 855 (SD855).
  • The Intel CPU evaluation environment is based on an Intel(R) Xeon(R) Gold 6148.
  • GPU inference time is measured under the FP32+TensorRT configuration by running each model 2100 times and excluding the first 100 warmup runs.
  • FLOPs and Params are calculated with paddle.flops() (PaddlePaddle version 2.2); a usage sketch is given after this list.
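For reference, the snippet below is a minimal sketch of how FLOPs and Params can be obtained with paddle.flops(); the network used here is a stock paddle.vision model standing in for the PaddleClas architectures in the tables.

```python
import numpy as np
import paddle
from paddle.vision.models import mobilenet_v2  # stand-in network, not a PaddleClas model

# Count FLOPs for a single 224x224 RGB image, the input size used for the tables.
model = mobilenet_v2()
flops = paddle.flops(model, input_size=[1, 3, 224, 224], print_detail=False)

# Parameter count, summed directly from the weight shapes.
params = sum(int(np.prod(p.shape)) for p in model.parameters())
print(f"FLOPs: {flops}, Params: {params}")
```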

Curves of accuracy versus inference time for common server-side models are shown below.

Curves of accuracy versus inference time for common mobile-side models are shown below.

Curves of accuracy versus inference time for some Vision Transformer models are shown below.

2. SSLD pretrained models

Accuracy and inference time of the pretrained models based on SSLD distillation are shown below. More detailed information can be found in the SSLD distillation tutorial.
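The pretrained models in the tables below are distributed as .pdparams weight files. As a rough, minimal sketch of how such a file can be loaded with plain PaddlePaddle APIs (the file name is a placeholder, and a stock paddle.vision network stands in for the actual PaddleClas architectures such as ResNet50_vd_ssld, which are defined in ppcls):

```python
import paddle
from paddle.vision.models import resnet50  # stand-in network; the _vd/_ssld variants live in PaddleClas (ppcls)

# Build a network and load downloaded weights from a .pdparams file (placeholder path).
model = resnet50(num_classes=1000)
state_dict = paddle.load("model_pretrained.pdparams")  # hypothetical local file matching the network
model.set_state_dict(state_dict)
model.eval()

# Sanity check with a dummy 224x224 input.
x = paddle.randn([1, 3, 224, 224])
print(model(x).shape)  # [1, 1000] for an ImageNet-1k classifier
```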

2.1 Server-side knowledge distillation model

| Model | Top-1 Acc | Reference Top-1 Acc | Acc gain | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|---|
| ResNet34_vd_ssld | 0.797 | 0.760 | 0.037 | 2.00 | 3.28 | 5.84 | 3.93 | 21.84 | Download link | Download link |
| ResNet50_vd_ssld | 0.830 | 0.792 | 0.039 | 2.60 | 4.86 | 7.63 | 4.35 | 25.63 | Download link | Download link |
| ResNet101_vd_ssld | 0.837 | 0.802 | 0.035 | 4.43 | 8.25 | 12.60 | 8.08 | 44.67 | Download link | Download link |
| Res2Net50_vd_26w_4s_ssld | 0.831 | 0.798 | 0.033 | 3.59 | 6.35 | 9.50 | 4.28 | 25.76 | Download link | Download link |
| Res2Net101_vd_26w_4s_ssld | 0.839 | 0.806 | 0.033 | 6.34 | 11.02 | 16.13 | 8.35 | 45.35 | Download link | Download link |
| Res2Net200_vd_26w_4s_ssld | 0.851 | 0.812 | 0.049 | 11.45 | 19.77 | 28.81 | 15.77 | 76.44 | Download link | Download link |
| HRNet_W18_C_ssld | 0.812 | 0.769 | 0.043 | 6.66 | 8.94 | 11.95 | 4.32 | 21.35 | Download link | Download link |
| HRNet_W48_C_ssld | 0.836 | 0.790 | 0.046 | 11.07 | 17.06 | 27.28 | 17.34 | 77.57 | Download link | Download link |
| SE_HRNet_W64_C_ssld | 0.848 | - | - | 17.11 | 26.87 | 43.24 | 29.00 | 129.12 | Download link | Download link |

2.2 Mobile-side knowledge distillation model

| Model | Top-1 Acc | Reference Top-1 Acc | Acc gain | SD855 time(ms) bs=1, thread=1 | SD855 time(ms) bs=1, thread=2 | SD855 time(ms) bs=1, thread=4 | FLOPs(M) | Params(M) | Model size(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MobileNetV1_ssld | 0.779 | 0.710 | 0.069 | 30.24 | 17.86 | 10.30 | 578.88 | 4.25 | 16 | Download link | Download link |
| MobileNetV2_ssld | 0.767 | 0.722 | 0.045 | 20.74 | 12.71 | 8.10 | 327.84 | 3.54 | 14 | Download link | Download link |
| MobileNetV3_small_x0_35_ssld | 0.556 | 0.530 | 0.026 | 2.23 | 1.66 | 1.43 | 14.56 | 1.67 | 6.9 | Download link | Download link |
| MobileNetV3_large_x1_0_ssld | 0.790 | 0.753 | 0.036 | 16.55 | 10.09 | 6.84 | 229.66 | 5.50 | 21 | Download link | Download link |
| MobileNetV3_small_x1_0_ssld | 0.713 | 0.682 | 0.031 | 5.63 | 3.65 | 2.60 | 63.67 | 2.95 | 12 | Download link | Download link |
| GhostNet_x1_3_ssld | 0.794 | 0.757 | 0.037 | 19.16 | 12.25 | 9.40 | 236.89 | 7.38 | 29 | Download link | Download link |

2.3 Intel-CPU-side knowledge distillation model

| Model | Top-1 Acc | Reference Top-1 Acc | Acc gain | Intel-Xeon-Gold-6148 time(ms) bs=1 | FLOPs(M) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|
| PPLCNet_x0_5_ssld | 0.661 | 0.631 | 0.030 | 2.05 | 47.28 | 1.89 | Download link | Download link |
| PPLCNet_x1_0_ssld | 0.744 | 0.713 | 0.033 | 2.46 | 160.81 | 2.96 | Download link | Download link |
| PPLCNet_x2_5_ssld | 0.808 | 0.766 | 0.042 | 5.39 | 906.49 | 9.04 | Download link | Download link |

  • Note: Reference Top-1 Acc is the accuracy of the pretrained model obtained by PaddleClas training on the ImageNet-1k dataset.

3. PP-LCNet series [28]

The accuracy and speed indicators of the PP-LCNet series models are shown in the following table. For more information about this series of models, please refer to: PP-LCNet series model documents.

| Model | Top-1 Acc | Top-5 Acc | Intel-Xeon-Gold-6148 time(ms) bs=1 | FLOPs(M) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|
| PPLCNet_x0_25 | 0.5186 | 0.7565 | 1.61785 | 18.25 | 1.52 | Download link | Download link |
| PPLCNet_x0_35 | 0.5809 | 0.8083 | 2.11344 | 29.46 | 1.65 | Download link | Download link |
| PPLCNet_x0_5 | 0.6314 | 0.8466 | 2.72974 | 47.28 | 1.89 | Download link | Download link |
| PPLCNet_x0_75 | 0.6818 | 0.8830 | 4.51216 | 98.82 | 2.37 | Download link | Download link |
| PPLCNet_x1_0 | 0.7132 | 0.9003 | 6.49276 | 160.81 | 2.96 | Download link | Download link |
| PPLCNet_x1_5 | 0.7371 | 0.9153 | 12.2601 | 341.86 | 4.52 | Download link | Download link |
| PPLCNet_x2_0 | 0.7518 | 0.9227 | 20.1667 | 590 | 6.54 | Download link | Download link |
| PPLCNet_x2_5 | 0.7660 | 0.9300 | 29.595 | 906 | 9.04 | Download link | Download link |

4. ResNet series [1]

The accuracy and speed indicators of the ResNet and ResNet_vd series models are shown in the following table. For more information about this series of models, please refer to: ResNet and ResNet_vd series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| ResNet18 | 0.7098 | 0.8992 | 1.22 | 2.19 | 3.63 | 1.83 | 11.70 | Download link | Download link |
| ResNet18_vd | 0.7226 | 0.9080 | 1.26 | 2.28 | 3.89 | 2.07 | 11.72 | Download link | Download link |
| ResNet34 | 0.7457 | 0.9214 | 1.97 | 3.25 | 5.70 | 3.68 | 21.81 | Download link | Download link |
| ResNet34_vd | 0.7598 | 0.9298 | 2.00 | 3.28 | 5.84 | 3.93 | 21.84 | Download link | Download link |
| ResNet34_vd_ssld | 0.7972 | 0.9490 | 2.00 | 3.28 | 5.84 | 3.93 | 21.84 | Download link | Download link |
| ResNet50 | 0.7650 | 0.9300 | 2.54 | 4.79 | 7.40 | 4.11 | 25.61 | Download link | Download link |
| ResNet50_vc | 0.7835 | 0.9403 | 2.57 | 4.83 | 7.52 | 4.35 | 25.63 | Download link | Download link |
| ResNet50_vd | 0.7912 | 0.9444 | 2.60 | 4.86 | 7.63 | 4.35 | 25.63 | Download link | Download link |
| ResNet101 | 0.7756 | 0.9364 | 4.37 | 8.18 | 12.38 | 7.83 | 44.65 | Download link | Download link |
| ResNet101_vd | 0.8017 | 0.9497 | 4.43 | 8.25 | 12.60 | 8.08 | 44.67 | Download link | Download link |
| ResNet152 | 0.7826 | 0.9396 | 6.05 | 11.41 | 17.33 | 11.56 | 60.34 | Download link | Download link |
| ResNet152_vd | 0.8059 | 0.9530 | 6.11 | 11.51 | 17.59 | 11.80 | 60.36 | Download link | Download link |
| ResNet200_vd | 0.8093 | 0.9533 | 7.70 | 14.57 | 22.16 | 15.30 | 74.93 | Download link | Download link |
| ResNet50_vd_ssld | 0.8300 | 0.9640 | 2.60 | 4.86 | 7.63 | 4.35 | 25.63 | Download link | Download link |
| ResNet101_vd_ssld | 0.8373 | 0.9669 | 4.43 | 8.25 | 12.60 | 8.08 | 44.67 | Download link | Download link |

5. Mobile series [3][4][5][6][23]

The accuracy and speed indicators of the mobile series models are shown in the following table. For more information about this series, please refer to: Mobile series model documents.

| Model | Top-1 Acc | Top-5 Acc | SD855 time(ms) bs=1, thread=1 | SD855 time(ms) bs=1, thread=2 | SD855 time(ms) bs=1, thread=4 | FLOPs(M) | Params(M) | Model size(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|---|
| MobileNetV1_x0_25 | 0.5143 | 0.7546 | 2.88 | 1.82 | 1.26 | 43.56 | 0.48 | 1.9 | Download link | Download link |
| MobileNetV1_x0_5 | 0.6352 | 0.8473 | 8.74 | 5.26 | 3.09 | 154.57 | 1.34 | 5.2 | Download link | Download link |
| MobileNetV1_x0_75 | 0.6881 | 0.8823 | 17.84 | 10.61 | 6.21 | 333.00 | 2.60 | 10 | Download link | Download link |
| MobileNetV1 | 0.7099 | 0.8968 | 30.24 | 17.86 | 10.30 | 578.88 | 4.25 | 16 | Download link | Download link |
| MobileNetV1_ssld | 0.7789 | 0.9394 | 30.24 | 17.86 | 10.30 | 578.88 | 4.25 | 16 | Download link | Download link |
| MobileNetV2_x0_25 | 0.5321 | 0.7652 | 3.46 | 2.51 | 2.03 | 34.18 | 1.53 | 6.1 | Download link | Download link |
| MobileNetV2_x0_5 | 0.6503 | 0.8572 | 7.69 | 4.92 | 3.57 | 99.48 | 1.98 | 7.8 | Download link | Download link |
| MobileNetV2_x0_75 | 0.6983 | 0.8901 | 13.69 | 8.60 | 5.82 | 197.37 | 2.65 | 10 | Download link | Download link |
| MobileNetV2 | 0.7215 | 0.9065 | 20.74 | 12.71 | 8.10 | 327.84 | 3.54 | 14 | Download link | Download link |
| MobileNetV2_x1_5 | 0.7412 | 0.9167 | 40.79 | 24.49 | 15.50 | 702.35 | 6.90 | 26 | Download link | Download link |
| MobileNetV2_x2_0 | 0.7523 | 0.9258 | 67.50 | 40.03 | 25.55 | 1217.25 | 11.33 | 43 | Download link | Download link |
| MobileNetV2_ssld | 0.7674 | 0.9339 | 20.74 | 12.71 | 8.10 | 327.84 | 3.54 | 14 | Download link | Download link |
| MobileNetV3_large_x1_25 | 0.7641 | 0.9295 | 24.52 | 14.76 | 9.89 | 362.70 | 7.47 | 29 | Download link | Download link |
| MobileNetV3_large_x1_0 | 0.7532 | 0.9231 | 16.55 | 10.09 | 6.84 | 229.66 | 5.50 | 21 | Download link | Download link |
| MobileNetV3_large_x0_75 | 0.7314 | 0.9108 | 11.53 | 7.06 | 4.94 | 151.70 | 3.93 | 16 | Download link | Download link |
| MobileNetV3_large_x0_5 | 0.6924 | 0.8852 | 6.50 | 4.22 | 3.15 | 71.83 | 2.69 | 11 | Download link | Download link |
| MobileNetV3_large_x0_35 | 0.6432 | 0.8546 | 4.43 | 3.11 | 2.41 | 40.90 | 2.11 | 8.6 | Download link | Download link |
| MobileNetV3_small_x1_25 | 0.7067 | 0.8951 | 7.88 | 4.91 | 3.45 | 100.07 | 3.64 | 14 | Download link | Download link |
| MobileNetV3_small_x1_0 | 0.6824 | 0.8806 | 5.63 | 3.65 | 2.60 | 63.67 | 2.95 | 12 | Download link | Download link |
| MobileNetV3_small_x0_75 | 0.6602 | 0.8633 | 4.50 | 2.96 | 2.19 | 46.02 | 2.38 | 9.6 | Download link | Download link |
| MobileNetV3_small_x0_5 | 0.5921 | 0.8152 | 2.89 | 2.04 | 1.62 | 22.60 | 1.91 | 7.8 | Download link | Download link |
| MobileNetV3_small_x0_35 | 0.5303 | 0.7637 | 2.23 | 1.66 | 1.43 | 14.56 | 1.67 | 6.9 | Download link | Download link |
| MobileNetV3_small_x0_35_ssld | 0.5555 | 0.7771 | 2.23 | 1.66 | 1.43 | 14.56 | 1.67 | 6.9 | Download link | Download link |
| MobileNetV3_large_x1_0_ssld | 0.7896 | 0.9448 | 16.55 | 10.09 | 6.84 | 229.66 | 5.50 | 21 | Download link | Download link |
| MobileNetV3_small_x1_0_ssld | 0.7129 | 0.9010 | 5.63 | 3.65 | 2.60 | 63.67 | 2.95 | 12 | Download link | Download link |
| ShuffleNetV2 | 0.6880 | 0.8845 | 9.72 | 5.97 | 4.13 | 148.86 | 2.29 | 9 | Download link | Download link |
| ShuffleNetV2_x0_25 | 0.4990 | 0.7379 | 1.94 | 1.53 | 1.43 | 18.95 | 0.61 | 2.7 | Download link | Download link |
| ShuffleNetV2_x0_33 | 0.5373 | 0.7705 | 2.23 | 1.70 | 1.79 | 24.04 | 0.65 | 2.8 | Download link | Download link |
| ShuffleNetV2_x0_5 | 0.6032 | 0.8226 | 3.67 | 2.63 | 2.06 | 42.58 | 1.37 | 5.6 | Download link | Download link |
| ShuffleNetV2_x1_5 | 0.7163 | 0.9015 | 17.21 | 10.56 | 6.81 | 301.35 | 3.53 | 14 | Download link | Download link |
| ShuffleNetV2_x2_0 | 0.7315 | 0.9120 | 31.21 | 18.98 | 11.65 | 571.70 | 7.40 | 28 | Download link | Download link |
| ShuffleNetV2_swish | 0.7003 | 0.8917 | 31.21 | 9.06 | 5.74 | 148.86 | 2.29 | 9.1 | Download link | Download link |
| GhostNet_x0_5 | 0.6688 | 0.8695 | 5.28 | 3.95 | 3.29 | 46.15 | 2.60 | 10 | Download link | Download link |
| GhostNet_x1_0 | 0.7402 | 0.9165 | 12.89 | 8.66 | 6.72 | 148.78 | 5.21 | 20 | Download link | Download link |
| GhostNet_x1_3 | 0.7579 | 0.9254 | 19.16 | 12.25 | 9.40 | 236.89 | 7.38 | 29 | Download link | Download link |
| GhostNet_x1_3_ssld | 0.7938 | 0.9449 | 19.16 | 12.25 | 9.40 | 236.89 | 7.38 | 29 | Download link | Download link |
| ESNet_x0_25 | 0.6248 | 0.8346 | 4.12 | 2.97 | 2.51 | 30.85 | 2.83 | 11 | Download link | Download link |
| ESNet_x0_5 | 0.6882 | 0.8804 | 6.45 | 4.42 | 3.35 | 67.31 | 3.25 | 13 | Download link | Download link |
| ESNet_x0_75 | 0.7224 | 0.9045 | 9.59 | 6.28 | 4.52 | 123.74 | 3.87 | 15 | Download link | Download link |
| ESNet_x1_0 | 0.7392 | 0.9140 | 13.67 | 8.71 | 5.97 | 197.33 | 4.64 | 18 | Download link | Download link |

6. SEResNeXt and Res2Net series [7][8][9]

The accuracy and speed indicators of the SEResNeXt and Res2Net series models are shown in the following table. For more information about the models of this series, please refer to: SEResNeXt and Res2Net series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| Res2Net50_26w_4s | 0.7933 | 0.9457 | 3.52 | 6.23 | 9.30 | 4.28 | 25.76 | Download link | Download link |
| Res2Net50_vd_26w_4s | 0.7975 | 0.9491 | 3.59 | 6.35 | 9.50 | 4.52 | 25.78 | Download link | Download link |
| Res2Net50_14w_8s | 0.7946 | 0.9470 | 4.39 | 7.21 | 10.38 | 4.20 | 25.12 | Download link | Download link |
| Res2Net101_vd_26w_4s | 0.8064 | 0.9522 | 6.34 | 11.02 | 16.13 | 8.35 | 45.35 | Download link | Download link |
| Res2Net200_vd_26w_4s | 0.8121 | 0.9571 | 11.45 | 19.77 | 28.81 | 15.77 | 76.44 | Download link | Download link |
| Res2Net200_vd_26w_4s_ssld | 0.8513 | 0.9742 | 11.45 | 19.77 | 28.81 | 15.77 | 76.44 | Download link | Download link |
| ResNeXt50_32x4d | 0.7775 | 0.9382 | 5.07 | 8.49 | 12.02 | 4.26 | 25.10 | Download link | Download link |
| ResNeXt50_vd_32x4d | 0.7956 | 0.9462 | 5.29 | 8.68 | 12.33 | 4.50 | 25.12 | Download link | Download link |
| ResNeXt50_64x4d | 0.7843 | 0.9413 | 9.39 | 13.97 | 20.56 | 8.02 | 45.29 | Download link | Download link |
| ResNeXt50_vd_64x4d | 0.8012 | 0.9486 | 9.75 | 14.14 | 20.84 | 8.26 | 45.31 | Download link | Download link |
| ResNeXt101_32x4d | 0.7865 | 0.9419 | 11.34 | 16.78 | 22.80 | 8.01 | 44.32 | Download link | Download link |
| ResNeXt101_vd_32x4d | 0.8033 | 0.9512 | 11.36 | 17.01 | 23.07 | 8.25 | 44.33 | Download link | Download link |
| ResNeXt101_64x4d | 0.7835 | 0.9452 | 21.57 | 28.08 | 39.49 | 15.52 | 83.66 | Download link | Download link |
| ResNeXt101_vd_64x4d | 0.8078 | 0.9520 | 21.57 | 28.22 | 39.70 | 15.76 | 83.68 | Download link | Download link |
| ResNeXt152_32x4d | 0.7898 | 0.9433 | 17.14 | 25.11 | 33.79 | 11.76 | 60.15 | Download link | Download link |
| ResNeXt152_vd_32x4d | 0.8072 | 0.9520 | 16.99 | 25.29 | 33.85 | 12.01 | 60.17 | Download link | Download link |
| ResNeXt152_64x4d | 0.7951 | 0.9471 | 33.07 | 42.05 | 59.13 | 23.03 | 115.27 | Download link | Download link |
| ResNeXt152_vd_64x4d | 0.8108 | 0.9534 | 33.30 | 42.41 | 59.42 | 23.27 | 115.29 | Download link | Download link |
| SE_ResNet18_vd | 0.7333 | 0.9138 | 1.48 | 2.70 | 4.32 | 2.07 | 11.81 | Download link | Download link |
| SE_ResNet34_vd | 0.7651 | 0.9320 | 2.42 | 3.69 | 6.29 | 3.93 | 22.00 | Download link | Download link |
| SE_ResNet50_vd | 0.7952 | 0.9475 | 3.11 | 5.99 | 9.34 | 4.36 | 28.16 | Download link | Download link |
| SE_ResNeXt50_32x4d | 0.7844 | 0.9396 | 6.39 | 11.01 | 14.94 | 4.27 | 27.63 | Download link | Download link |
| SE_ResNeXt50_vd_32x4d | 0.8024 | 0.9489 | 7.04 | 11.57 | 16.01 | 5.64 | 27.76 | Download link | Download link |
| SE_ResNeXt101_32x4d | 0.7939 | 0.9443 | 13.31 | 21.85 | 28.77 | 8.03 | 49.09 | Download link | Download link |
| SENet154_vd | 0.8140 | 0.9548 | 34.83 | 51.22 | 69.74 | 24.45 | 122.03 | Download link | Download link |

7. DPN and DenseNet series [14][15]

The accuracy and speed indicators of the DPN and DenseNet series models are shown in the following table. For more information about the models of this series, please refer to: DPN and DenseNet series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| DenseNet121 | 0.7566 | 0.9258 | 3.40 | 6.94 | 9.17 | 2.87 | 8.06 | Download link | Download link |
| DenseNet161 | 0.7857 | 0.9414 | 7.06 | 14.37 | 19.55 | 7.79 | 28.90 | Download link | Download link |
| DenseNet169 | 0.7681 | 0.9331 | 5.00 | 10.29 | 12.84 | 3.40 | 14.31 | Download link | Download link |
| DenseNet201 | 0.7763 | 0.9366 | 6.38 | 13.72 | 17.17 | 4.34 | 20.24 | Download link | Download link |
| DenseNet264 | 0.7796 | 0.9385 | 9.34 | 20.95 | 25.41 | 5.82 | 33.74 | Download link | Download link |
| DPN68 | 0.7678 | 0.9343 | 8.18 | 11.40 | 14.82 | 2.35 | 12.68 | Download link | Download link |
| DPN92 | 0.7985 | 0.9480 | 12.48 | 20.04 | 25.10 | 6.54 | 37.79 | Download link | Download link |
| DPN98 | 0.8059 | 0.9510 | 14.70 | 25.55 | 35.12 | 11.728 | 61.74 | Download link | Download link |
| DPN107 | 0.8089 | 0.9532 | 19.46 | 35.62 | 50.22 | 18.38 | 87.13 | Download link | Download link |
| DPN131 | 0.8070 | 0.9514 | 19.64 | 34.60 | 47.42 | 16.09 | 79.48 | Download link | Download link |

8. HRNet series [13]

The accuracy and speed indicators of the HRNet series models are shown in the following table. For more information about the models of this series, please refer to: HRNet series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| HRNet_W18_C | 0.7692 | 0.9339 | 6.66 | 8.94 | 11.95 | 4.32 | 21.35 | Download link | Download link |
| HRNet_W18_C_ssld | 0.81162 | 0.95804 | 6.66 | 8.94 | 11.95 | 4.32 | 21.35 | Download link | Download link |
| HRNet_W30_C | 0.7804 | 0.9402 | 8.61 | 11.40 | 15.23 | 8.15 | 37.78 | Download link | Download link |
| HRNet_W32_C | 0.7828 | 0.9424 | 8.54 | 11.58 | 15.57 | 8.97 | 41.30 | Download link | Download link |
| HRNet_W40_C | 0.7877 | 0.9447 | 9.83 | 15.02 | 20.92 | 12.74 | 57.64 | Download link | Download link |
| HRNet_W44_C | 0.7900 | 0.9451 | 10.62 | 16.18 | 25.92 | 14.94 | 67.16 | Download link | Download link |
| HRNet_W48_C | 0.7895 | 0.9442 | 11.07 | 17.06 | 27.28 | 17.34 | 77.57 | Download link | Download link |
| HRNet_W48_C_ssld | 0.8363 | 0.9682 | 11.07 | 17.06 | 27.28 | 17.34 | 77.57 | Download link | Download link |
| HRNet_W64_C | 0.7930 | 0.9461 | 13.82 | 21.15 | 35.51 | 28.97 | 128.18 | Download link | Download link |
| SE_HRNet_W64_C_ssld | 0.8475 | 0.9726 | 17.11 | 26.87 | 43.24 | 29.00 | 129.12 | Download link | Download link |

9. Inception series [10][11][12][26]

The accuracy and speed indicators of the Inception series models are shown in the following table. For more information about this series of models, please refer to: Inception series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| GoogLeNet | 0.7070 | 0.8966 | 1.41 | 3.25 | 5.00 | 1.44 | 11.54 | Download link | Download link |
| Xception41 | 0.7930 | 0.9453 | 3.58 | 8.76 | 16.61 | 8.57 | 23.02 | Download link | Download link |
| Xception41_deeplab | 0.7955 | 0.9438 | 3.81 | 9.16 | 17.20 | 9.28 | 27.08 | Download link | Download link |
| Xception65 | 0.8100 | 0.9549 | 5.45 | 12.78 | 24.53 | 13.25 | 36.04 | Download link | Download link |
| Xception65_deeplab | 0.8032 | 0.9449 | 5.65 | 13.08 | 24.61 | 13.96 | 40.10 | Download link | Download link |
| Xception71 | 0.8111 | 0.9545 | 6.19 | 15.34 | 29.21 | 16.21 | 37.86 | Download link | Download link |
| InceptionV3 | 0.7914 | 0.9459 | 4.78 | 8.53 | 12.28 | 5.73 | 23.87 | Download link | Download link |
| InceptionV4 | 0.8077 | 0.9526 | 8.93 | 15.17 | 21.56 | 12.29 | 42.74 | Download link | Download link |

10. EfficientNet and ResNeXt101_wsl series [16][17]

The accuracy and speed indicators of the EfficientNet and ResNeXt101_wsl series models are shown in the following table. For more information about this series of models, please refer to: EfficientNet and ResNeXt101_wsl series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| ResNeXt101_32x8d_wsl | 0.8255 | 0.9674 | 13.55 | 23.39 | 36.18 | 16.48 | 88.99 | Download link | Download link |
| ResNeXt101_32x16d_wsl | 0.8424 | 0.9726 | 21.96 | 38.35 | 63.29 | 36.26 | 194.36 | Download link | Download link |
| ResNeXt101_32x32d_wsl | 0.8497 | 0.9759 | 37.28 | 76.50 | 121.56 | 87.28 | 469.12 | Download link | Download link |
| ResNeXt101_32x48d_wsl | 0.8537 | 0.9769 | 55.07 | 124.39 | 205.01 | 153.57 | 829.26 | Download link | Download link |
| Fix_ResNeXt101_32x48d_wsl | 0.8626 | 0.9797 | 55.01 | 122.63 | 204.66 | 313.41 | 829.26 | Download link | Download link |
| EfficientNetB0 | 0.7738 | 0.9331 | 1.96 | 3.71 | 5.56 | 0.40 | 5.33 | Download link | Download link |
| EfficientNetB1 | 0.7915 | 0.9441 | 2.88 | 5.40 | 7.63 | 0.71 | 7.86 | Download link | Download link |
| EfficientNetB2 | 0.7985 | 0.9474 | 3.26 | 6.20 | 9.17 | 1.02 | 9.18 | Download link | Download link |
| EfficientNetB3 | 0.8115 | 0.9541 | 4.52 | 8.85 | 13.54 | 1.88 | 12.324 | Download link | Download link |
| EfficientNetB4 | 0.8285 | 0.9623 | 6.78 | 15.47 | 24.95 | 4.51 | 19.47 | Download link | Download link |
| EfficientNetB5 | 0.8362 | 0.9672 | 10.97 | 27.24 | 45.93 | 10.51 | 30.56 | Download link | Download link |
| EfficientNetB6 | 0.8400 | 0.9688 | 17.09 | 43.32 | 76.90 | 19.47 | 43.27 | Download link | Download link |
| EfficientNetB7 | 0.8430 | 0.9689 | 25.91 | 71.23 | 128.20 | 38.45 | 66.66 | Download link | Download link |
| EfficientNetB0_small | 0.7580 | 0.9258 | 1.24 | 2.59 | 3.92 | 0.40 | 4.69 | Download link | Download link |

11. ResNeSt and RegNet series [24][25]

The accuracy and speed indicators of the ResNeSt and RegNet series models are shown in the following table. For more information about the models of this series, please refer to: ResNeSt and RegNet series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| ResNeSt50_fast_1s1x64d | 0.8035 | 0.9528 | 2.73 | 5.33 | 8.24 | 4.36 | 26.27 | Download link | Download link |
| ResNeSt50 | 0.8083 | 0.9542 | 7.36 | 10.23 | 13.84 | 5.40 | 27.54 | Download link | Download link |
| RegNetX_4GF | 0.785 | 0.9416 | 6.46 | 8.48 | 11.45 | 4.00 | 22.23 | Download link | Download link |

12. ViT and DeiT series [31][32]

The accuracy and speed indicators of ViT (Vision Transformer) and DeiT (Data-efficient Image Transformers) series models are shown in the following table. For more information about this series of models, please refer to: ViT_and_DeiT series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| ViT_small_patch16_224 | 0.7769 | 0.9342 | 3.71 | 9.05 | 16.72 | 9.41 | 48.60 | Download link | Download link |
| ViT_base_patch16_224 | 0.8195 | 0.9617 | 6.12 | 14.84 | 28.51 | 16.85 | 86.42 | Download link | Download link |
| ViT_base_patch16_384 | 0.8414 | 0.9717 | 14.15 | 48.38 | 95.06 | 49.35 | 86.42 | Download link | Download link |
| ViT_base_patch32_384 | 0.8176 | 0.9613 | 4.94 | 13.43 | 24.08 | 12.66 | 88.19 | Download link | Download link |
| ViT_large_patch16_224 | 0.8323 | 0.9650 | 15.53 | 49.50 | 94.09 | 59.65 | 304.12 | Download link | Download link |
| ViT_large_patch16_384 | 0.8513 | 0.9736 | 39.51 | 152.46 | 304.06 | 174.70 | 304.12 | Download link | Download link |
| ViT_large_patch32_384 | 0.8153 | 0.9608 | 11.44 | 36.09 | 70.63 | 44.24 | 306.48 | Download link | Download link |

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| DeiT_tiny_patch16_224 | 0.718 | 0.910 | 3.61 | 3.94 | 6.10 | 1.07 | 5.68 | Download link | Download link |
| DeiT_small_patch16_224 | 0.796 | 0.949 | 3.61 | 6.24 | 10.49 | 4.24 | 21.97 | Download link | Download link |
| DeiT_base_patch16_224 | 0.817 | 0.957 | 6.13 | 14.87 | 28.50 | 16.85 | 86.42 | Download link | Download link |
| DeiT_base_patch16_384 | 0.830 | 0.962 | 14.12 | 48.80 | 97.60 | 49.35 | 86.42 | Download link | Download link |
| DeiT_tiny_distilled_patch16_224 | 0.741 | 0.918 | 3.51 | 4.05 | 6.03 | 1.08 | 5.87 | Download link | Download link |
| DeiT_small_distilled_patch16_224 | 0.809 | 0.953 | 3.70 | 6.20 | 10.53 | 4.26 | 22.36 | Download link | Download link |
| DeiT_base_distilled_patch16_224 | 0.831 | 0.964 | 6.17 | 14.94 | 28.58 | 16.93 | 87.18 | Download link | Download link |
| DeiT_base_distilled_patch16_384 | 0.851 | 0.973 | 14.12 | 48.76 | 97.09 | 49.43 | 87.18 | Download link | Download link |

13. RepVGG series [36]

The accuracy and speed indicators of RepVGG series models are shown in the following table. For more introduction, please refer to: RepVGG series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| RepVGG_A0 | 0.7131 | 0.9016 | | | | 1.36 | 8.31 | Download link | Download link |
| RepVGG_A1 | 0.7380 | 0.9146 | | | | 2.37 | 12.79 | Download link | Download link |
| RepVGG_A2 | 0.7571 | 0.9264 | | | | 5.12 | 25.50 | Download link | Download link |
| RepVGG_B0 | 0.7450 | 0.9213 | | | | 3.06 | 14.34 | Download link | Download link |
| RepVGG_B1 | 0.7773 | 0.9385 | | | | 11.82 | 51.83 | Download link | Download link |
| RepVGG_B2 | 0.7813 | 0.9410 | | | | 18.38 | 80.32 | Download link | Download link |
| RepVGG_B1g2 | 0.7732 | 0.9359 | | | | 8.82 | 41.36 | Download link | Download link |
| RepVGG_B1g4 | 0.7675 | 0.9335 | | | | 7.31 | 36.13 | Download link | Download link |
| RepVGG_B2g4 | 0.7881 | 0.9448 | | | | 11.34 | 55.78 | Download link | Download link |
| RepVGG_B3g4 | 0.7965 | 0.9485 | | | | 16.07 | 75.63 | Download link | Download link |

14. MixNet series [29]

The accuracy and speed indicators of the MixNet series models are shown in the following table. For more introduction, please refer to: MixNet series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(M) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| MixNet_S | 0.7628 | 0.9299 | 2.31 | 3.63 | 5.20 | 252.977 | 4.167 | Download link | Download link |
| MixNet_M | 0.7767 | 0.9364 | 2.84 | 4.60 | 6.62 | 357.119 | 5.065 | Download link | Download link |
| MixNet_L | 0.7860 | 0.9437 | 3.16 | 5.55 | 8.03 | 579.017 | 7.384 | Download link | Download link |

15. ReXNet series [30]

The accuracy and speed indicators of ReXNet series models are shown in the following table. For more introduction, please refer to: ReXNet series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| ReXNet_1_0 | 0.7746 | 0.9370 | 3.08 | 4.15 | 5.49 | 0.415 | 4.84 | Download link | Download link |
| ReXNet_1_3 | 0.7913 | 0.9464 | 3.54 | 4.87 | 6.54 | 0.68 | 7.61 | Download link | Download link |
| ReXNet_1_5 | 0.8006 | 0.9512 | 3.68 | 5.31 | 7.38 | 0.90 | 9.79 | Download link | Download link |
| ReXNet_2_0 | 0.8122 | 0.9536 | 4.30 | 6.54 | 9.19 | 1.56 | 16.45 | Download link | Download link |
| ReXNet_3_0 | 0.8209 | 0.9612 | 5.74 | 9.49 | 13.62 | 3.44 | 34.83 | Download link | Download link |

16. SwinTransformer series [27]

The accuracy and speed indicators of SwinTransformer series models are shown in the following table. For more introduction, please refer to: SwinTransformer series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| SwinTransformer_tiny_patch4_window7_224 | 0.8069 | 0.9534 | 6.59 | 9.68 | 16.32 | 4.35 | 28.26 | Download link | Download link |
| SwinTransformer_small_patch4_window7_224 | 0.8275 | 0.9613 | 12.54 | 17.07 | 28.08 | 8.51 | 49.56 | Download link | Download link |
| SwinTransformer_base_patch4_window7_224 | 0.8300 | 0.9626 | 13.37 | 23.53 | 39.11 | 15.13 | 87.70 | Download link | Download link |
| SwinTransformer_base_patch4_window12_384 | 0.8439 | 0.9693 | 19.52 | 64.56 | 123.30 | 44.45 | 87.70 | Download link | Download link |
| SwinTransformer_base_patch4_window7_224[1] | 0.8487 | 0.9746 | 13.53 | 23.46 | 39.13 | 15.13 | 87.70 | Download link | Download link |
| SwinTransformer_base_patch4_window12_384[1] | 0.8642 | 0.9807 | 19.65 | 64.72 | 123.42 | 44.45 | 87.70 | Download link | Download link |
| SwinTransformer_large_patch4_window7_224[1] | 0.8596 | 0.9783 | 15.74 | 38.57 | 71.49 | 34.02 | 196.43 | Download link | Download link |
| SwinTransformer_large_patch4_window12_384[1] | 0.8719 | 0.9823 | 32.61 | 116.59 | 223.23 | 99.97 | 196.43 | Download link | Download link |

[1]: Pretrained on the ImageNet-22k dataset and then fine-tuned on the ImageNet-1k dataset.

17. LeViT series [33]

The accuracy and speed indicators of LeViT series models are shown in the following table. For more introduction, please refer to: LeViT series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(M) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| LeViT_128S | 0.7598 | 0.9269 | | | | 281 | 7.42 | Download link | Download link |
| LeViT_128 | 0.7810 | 0.9371 | | | | 365 | 8.87 | Download link | Download link |
| LeViT_192 | 0.7934 | 0.9446 | | | | 597 | 10.61 | Download link | Download link |
| LeViT_256 | 0.8085 | 0.9497 | | | | 1049 | 18.45 | Download link | Download link |
| LeViT_384 | 0.8191 | 0.9551 | | | | 2234 | 38.45 | Download link | Download link |

Note: The accuracy difference from the reference is due to differences in data preprocessing and to the distilled head not being used as the output.

18. Twins series [34]

The accuracy and speed indicators of Twins series models are shown in the following table. For more introduction, please refer to: Twins series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| pcpvt_small | 0.8082 | 0.9552 | 7.32 | 10.51 | 15.27 | 3.67 | 24.06 | Download link | Download link |
| pcpvt_base | 0.8242 | 0.9619 | 12.20 | 16.22 | 23.16 | 6.44 | 43.83 | Download link | Download link |
| pcpvt_large | 0.8273 | 0.9650 | 16.47 | 22.90 | 32.73 | 9.50 | 60.99 | Download link | Download link |
| alt_gvt_small | 0.8140 | 0.9546 | 6.94 | 9.01 | 12.27 | 2.81 | 24.06 | Download link | Download link |
| alt_gvt_base | 0.8294 | 0.9621 | 9.37 | 15.02 | 24.54 | 8.34 | 56.07 | Download link | Download link |
| alt_gvt_large | 0.8331 | 0.9642 | 11.76 | 22.08 | 35.12 | 14.81 | 99.27 | Download link | Download link |

Note: The accuracy difference from the reference is due to differences in data preprocessing.

19. HarDNet series [37]

The accuracy and speed indicators of HarDNet series models are shown in the following table. For more introduction, please refer to: HarDNet series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| HarDNet39_ds | 0.7133 | 0.8998 | 1.40 | 2.30 | 3.33 | 0.44 | 3.51 | Download link | Download link |
| HarDNet68_ds | 0.7362 | 0.9152 | 2.26 | 3.34 | 5.06 | 0.79 | 4.20 | Download link | Download link |
| HarDNet68 | 0.7546 | 0.9265 | 3.58 | 8.53 | 11.58 | 4.26 | 17.58 | Download link | Download link |
| HarDNet85 | 0.7744 | 0.9355 | 6.24 | 14.85 | 20.57 | 9.09 | 36.69 | Download link | Download link |

20. DLA series [38]

The accuracy and speed indicators of DLA series models are shown in the following table. For more introduction, please refer to: DLA series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| DLA102 | 0.7893 | 0.9452 | 4.95 | 8.08 | 12.40 | 7.19 | 33.34 | Download link | Download link |
| DLA102x2 | 0.7885 | 0.9445 | 19.58 | 23.97 | 31.37 | 9.34 | 41.42 | Download link | Download link |
| DLA102x | 0.781 | 0.9400 | 11.12 | 15.60 | 20.37 | 5.89 | 26.40 | Download link | Download link |
| DLA169 | 0.7809 | 0.9409 | 7.70 | 12.25 | 18.90 | 11.59 | 53.50 | Download link | Download link |
| DLA34 | 0.7603 | 0.9298 | 1.83 | 3.37 | 5.98 | 3.07 | 15.76 | Download link | Download link |
| DLA46_c | 0.6321 | 0.853 | 1.06 | 2.08 | 3.23 | 0.54 | 1.31 | Download link | Download link |
| DLA60 | 0.7610 | 0.9292 | 2.78 | 5.36 | 8.29 | 4.26 | 22.08 | Download link | Download link |
| DLA60x_c | 0.6645 | 0.8754 | 1.79 | 3.68 | 5.19 | 0.59 | 1.33 | Download link | Download link |
| DLA60x | 0.7753 | 0.9378 | 5.98 | 9.24 | 12.52 | 3.54 | 17.41 | Download link | Download link |

21. RedNet series [39]

The accuracy and speed indicators of RedNet series models are shown in the following table. For more introduction, please refer to: RedNet series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| RedNet26 | 0.7595 | 0.9319 | 4.45 | 15.16 | 29.03 | 1.69 | 9.26 | Download link | Download link |
| RedNet38 | 0.7747 | 0.9356 | 6.24 | 21.39 | 41.26 | 2.14 | 12.43 | Download link | Download link |
| RedNet50 | 0.7833 | 0.9417 | 8.04 | 27.71 | 53.73 | 2.61 | 15.60 | Download link | Download link |
| RedNet101 | 0.7894 | 0.9436 | 13.07 | 44.12 | 83.28 | 4.59 | 25.76 | Download link | Download link |
| RedNet152 | 0.7917 | 0.9440 | 18.66 | 63.27 | 119.48 | 6.57 | 34.14 | Download link | Download link |

22. TNT series [35]

The accuracy and speed indicators of TNT series models are shown in the following table. For more introduction, please refer to: TNT series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|
| TNT_small | 0.8121 | 0.9563 | | | 4.83 | 23.68 | Download link | Download link |

Note: For the TNT model, both the mean and std of the NormalizeImage operation in data preprocessing are 0.5.
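For illustration only, the following is a minimal NumPy sketch of that preprocessing step (the function name is made up for this example; the real pipeline is configured in PaddleClas via the NormalizeImage operator):

```python
import numpy as np

def normalize_for_tnt(img_uint8: np.ndarray) -> np.ndarray:
    """Scale an HWC uint8 image to [0, 1], then normalize each channel with mean=0.5 and std=0.5."""
    mean = np.full(3, 0.5, dtype=np.float32)
    std = np.full(3, 0.5, dtype=np.float32)
    img = img_uint8.astype(np.float32) / 255.0
    return (img - mean) / std
```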

23. CSWinTransformer series [40]

The accuracy and speed indicators of CSWinTransformer series models are shown in the following table. For more introduction, please refer to: CSWinTransformer series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| CSWinTransformer_tiny_224 | 0.8281 | 0.9628 | - | - | - | 4.1 | 22 | Download link | Download link |
| CSWinTransformer_small_224 | 0.8358 | 0.9658 | - | - | - | 6.4 | 35 | Download link | Download link |
| CSWinTransformer_base_224 | 0.8420 | 0.9692 | - | - | - | 14.3 | 77 | Download link | Download link |
| CSWinTransformer_large_224 | 0.8643 | 0.9799 | - | - | - | 32.2 | 173.3 | Download link | Download link |
| CSWinTransformer_base_384 | 0.8550 | 0.9749 | - | - | - | 42.2 | 77 | Download link | Download link |
| CSWinTransformer_large_384 | 0.8748 | 0.9833 | - | - | - | 94.7 | 173.3 | Download link | Download link |

24. PVTV2 series [41]

The accuracy and speed indicators of PVTV2 series models are shown in the following table. For more introduction, please refer to: PVTV2 series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| PVT_V2_B0 | 0.705 | 0.902 | - | - | - | 0.53 | 3.7 | Download link | Download link |
| PVT_V2_B1 | 0.787 | 0.945 | - | - | - | 2.0 | 14.0 | Download link | Download link |
| PVT_V2_B2 | 0.821 | 0.960 | - | - | - | 3.9 | 25.4 | Download link | Download link |
| PVT_V2_B2_Linear | 0.821 | 0.961 | - | - | - | 3.8 | 22.6 | Download link | Download link |
| PVT_V2_B3 | 0.831 | 0.965 | - | - | - | 6.7 | 45.2 | Download link | Download link |
| PVT_V2_B4 | 0.836 | 0.967 | - | - | - | 9.8 | 62.6 | Download link | Download link |
| PVT_V2_B5 | 0.837 | 0.966 | - | - | - | 11.4 | 82.0 | Download link | Download link |

25. MobileViT series [42]

The accuracy and speed indicators of MobileViT series models are shown in the following table. For more introduction, please refer to: MobileViT series model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(M) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| MobileViT_XXS | 0.6867 | 0.8878 | - | - | - | 1849.35 | 5.59 | Download link | Download link |
| MobileViT_XS | 0.7454 | 0.9227 | - | - | - | 930.75 | 2.33 | Download link | Download link |
| MobileViT_S | 0.7814 | 0.9413 | - | - | - | 337.24 | 1.28 | Download link | Download link |

26. Other models

The accuracy and speed indicators of AlexNet [18], SqueezeNet series [19], VGG series [20], DarkNet53 [21] and other models are shown in the following table. For more information, please refer to: Other model documents.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained Model Download Address | Inference Model Download Address |
|---|---|---|---|---|---|---|---|---|---|
| AlexNet | 0.567 | 0.792 | 0.81 | 1.50 | 2.33 | 0.71 | 61.10 | Download link | Download link |
| SqueezeNet1_0 | 0.596 | 0.817 | 0.68 | 1.64 | 2.62 | 0.78 | 1.25 | Download link | Download link |
| SqueezeNet1_1 | 0.601 | 0.819 | 0.62 | 1.30 | 2.09 | 0.35 | 1.24 | Download link | Download link |
| VGG11 | 0.693 | 0.891 | 1.72 | 4.15 | 7.24 | 7.61 | 132.86 | Download link | Download link |
| VGG13 | 0.700 | 0.894 | 2.02 | 5.28 | 9.54 | 11.31 | 133.05 | Download link | Download link |
| VGG16 | 0.720 | 0.907 | 2.48 | 6.79 | 12.33 | 15.470 | 138.35 | Download link | Download link |
| VGG19 | 0.726 | 0.909 | 2.93 | 8.28 | 15.21 | 19.63 | 143.66 | Download link | Download link |
| DarkNet53 | 0.780 | 0.941 | 2.79 | 6.42 | 10.89 | 9.31 | 41.65 | Download link | Download link |

Reference

[1] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770-778.

[2] He T, Zhang Z, Zhang H, et al. Bag of tricks for image classification with convolutional neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 558-567.

[3] Howard A, Sandler M, Chu G, et al. Searching for mobilenetv3[C]//Proceedings of the IEEE International Conference on Computer Vision. 2019: 1314-1324.

[4] Sandler M, Howard A, Zhu M, et al. Mobilenetv2: Inverted residuals and linear bottlenecks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 4510-4520.

[5] Howard A G, Zhu M, Chen B, et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications[J]. arXiv preprint arXiv:1704.04861, 2017.

[6] Ma N, Zhang X, Zheng H T, et al. Shufflenet v2: Practical guidelines for efficient cnn architecture design[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 116-131.

[7] Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1492-1500.

[8] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 7132-7141.

[9] Gao S, Cheng M M, Zhao K, et al. Res2net: A new multi-scale backbone architecture[J]. IEEE transactions on pattern analysis and machine intelligence, 2019.

[10] Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2015: 1-9.

[11] Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-resnet and the impact of residual connections on learning[C]//Thirty-first AAAI conference on artificial intelligence. 2017.

[12] Chollet F. Xception: Deep learning with depthwise separable convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1251-1258.

[13] Wang J, Sun K, Cheng T, et al. Deep high-resolution representation learning for visual recognition[J]. arXiv preprint arXiv:1908.07919, 2019.

[14] Chen Y, Li J, Xiao H, et al. Dual path networks[C]//Advances in neural information processing systems. 2017: 4467-4475.

[15] Huang G, Liu Z, Van Der Maaten L, et al. Densely connected convolutional networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 4700-4708.

[16] Tan M, Le Q V. Efficientnet: Rethinking model scaling for convolutional neural networks[J]. arXiv preprint arXiv:1905.11946, 2019.

[17] Mahajan D, Girshick R, Ramanathan V, et al. Exploring the limits of weakly supervised pretraining[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 181-196.

[18] Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[C]//Advances in neural information processing systems. 2012: 1097-1105.

[19] Iandola F N, Han S, Moskewicz M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and< 0.5 MB model size[J]. arXiv preprint arXiv:1602.07360, 2016.

[20] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.

[21] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 779-788.

[22] Ding X, Guo Y, Ding G, et al. Acnet: Strengthening the kernel skeletons for powerful cnn via asymmetric convolution blocks[C]//Proceedings of the IEEE International Conference on Computer Vision. 2019: 1911-1920.

[23] Han K, Wang Y, Tian Q, et al. GhostNet: More features from cheap operations[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 1580-1589.

[24] Zhang H, Wu C, Zhang Z, et al. Resnest: Split-attention networks[J]. arXiv preprint arXiv:2004.08955, 2020.

[25] Radosavovic I, Kosaraju R P, Girshick R, et al. Designing network design spaces[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 10428-10436.

[26] C.Szegedy, V.Vanhoucke, S.Ioffe, J.Shlens, and Z.Wojna. Rethinking the inception architecture for computer vision. arXiv preprint arXiv:1512.00567, 2015.

[27] Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin and Baining Guo. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows.

[28] Cheng Cui, Tingquan Gao, Shengyu Wei, Yuning Du, Ruoyu Guo, Shuilong Dong, Bin Lu, Ying Zhou, Xueying Lv, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma. PP-LCNet: A Lightweight CPU Convolutional Neural Network.

[29] Mingxing Tan, Quoc V. Le. MixConv: Mixed Depthwise Convolutional Kernels.

[30] Dongyoon Han, Sangdoo Yun, Byeongho Heo, YoungJoon Yoo. Rethinking Channel Dimensions for Efficient Model Design.

[31] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale.

[32] Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Herve Jegou. Training data-efficient image transformers & distillation through attention.

[33] Benjamin Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Herve Jegou, Matthijs Douze. LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference.

[34] Xiangxiang Chu, Zhi Tian, Yuqing Wang, Bo Zhang, Haibing Ren, Xiaolin Wei, Huaxia Xia, Chunhua Shen. Twins: Revisiting the Design of Spatial Attention in Vision Transformers.

[35] Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang. Transformer in Transformer.

[36] Xiaohan Ding, Xiangyu Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun. RepVGG: Making VGG-style ConvNets Great Again.

[37] Ping Chao, Chao-Yang Kao, Yu-Shan Ruan, Chien-Hsiang Huang, Youn-Long Lin. HarDNet: A Low Memory Traffic Network.

[38] Fisher Yu, Dequan Wang, Evan Shelhamer, Trevor Darrell. Deep Layer Aggregation.

[39] Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen. Involution: Inverting the Inherence of Convolution for Visual Recognition.

[40] Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Lu Yuan, Dong Chen, Baining Guo. CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows.

[41] Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao. PVTv2: Improved Baselines with Pyramid Vision Transformer.

[42] Sachin Mehta, Mohammad Rastegari. MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer.