Benchmark
This page contains the benchmark results for several popular image classification models. We auto-tune all listed models on target platforms and benchmark the inference performance (time cost per image).
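As a rough illustration of what "time cost per image" means at batch size 1, the sketch below times a callable after a few warm-up runs and reports the mean latency in milliseconds. It is not the script used to produce the results on this page. `measure_latency_ms` and the stand-in workload `fake_inference` are hypothetical names introduced only for this example.

```python
import statistics
import time


def measure_latency_ms(run_once, warmup=3, repeat=10):
    """Return (mean, stdev) latency in milliseconds for a zero-argument callable."""
    # Warm-up runs are executed but not timed, so one-off setup costs are excluded.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)

    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), stdev


def fake_inference():
    # Hypothetical placeholder for a single-image model invocation.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    mean_ms, std_ms = measure_latency_ms(fake_inference)
    print(f"{mean_ms:.3f} ms +/- {std_ms:.3f} ms per run")
```

In a real measurement, `run_once` would wrap one forward pass of the compiled model on a single input image.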
- Results
- Links
Note: If a board has a big.LITTLE architecture, we use all big cores. Otherwise, we use all cores. The device specifications below list only the cores being used.
- Firefly-RK3399 : 2 x Cortex A72 1.8GHz
- Raspberry Pi 3B : 4 x Cortex A53 1.2GHz
- Huawei P20 Pro / Mate10 Pro (SoC: HiSilicon Kirin 970) : 4 x Cortex A73 2.36GHz
- Google Pixel 2 (SoC: Qualcomm Snapdragon 835) : 4 x Kryo 2.35GHz
- PYNQ : 2 x Cortex-A9 650MHz
- dtype = float32, batch_size = 1 (unit: ms)
Device | densenet-121 | inception-v3 | mobilenet | mobilenet-v2 | resnet-18 | resnet-50 | squeezenet-v1.0 | squeezenet-v1.1 | vgg-16 | vgg-19 |
---|---|---|---|---|---|---|---|---|---|---|
Raspberry Pi 3B | 610.2 | 2074.2 | 121.8 | 104.8 | 320.0 | 726.0 | 185.1 | 94.0 | 1772.0 | 2119.8 |
Firefly RK3399 | 336.8 | 1304.4 | 77.9 | 64.8 | 158.6 | 403.2 | 94.3 | 48.2 | 903.5 | 1086.0 |
Huawei P20 Pro | 179.7 | 444.7 | 41.3 | 33.4 | 77.4 | 232.5 | 51.4 | 26.0 | 486.3 | 729.4 |
Google Pixel2 | 161.0 | 434.8 | 39.6 | 29.3 | 66.0 | 181.1 | 47.3 | 23.0 | 397.1 | 485.0 |
Xilinx PYNQ | 2887.0 | 9691.7 | 721.4 | 513.3 | 1231.7 | 3585.5 | 913.0 | 478.3 | -1.0 | -1.0 |
- Mali-T860 MP4 : on Firefly-RK3399, with its frequency locked to 800MHz
- dtype = float32, batch_size = 1 (unit: ms)
Device | densenet-121 | inception-v3 | mobilenet | mobilenet-v2 | resnet-18 | resnet-50 | squeezenet-v1.0 | squeezenet-v1.1 | vgg-16 | vgg-19 |
---|---|---|---|---|---|---|---|---|---|---|
Mali-T860 | 410.6 | 784.7 | 79.5 | 77.7 | 127.3 | 354.7 | 111.0 | 62.5 | 673.2 | 792.1 |
- dtype = float16, batch_size = 1 (unit: ms)
Device | densenet-121 | inception-v3 | mobilenet | mobilenet-v2 | resnet-18 | resnet-50 | squeezenet-v1.0 | squeezenet-v1.1 | vgg-16 | vgg-19 |
---|---|---|---|---|---|---|---|---|---|---|
Mali-T860 | 295.4 | 464.9 | 52.9 | 60.7 | 84.3 | 221.0 | 77.3 | 46.7 | 405.6 | 472.8 |
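To make the float32 versus float16 comparison on the Mali-T860 easier to read, the following sketch divides each float32 latency by the matching float16 latency. The numbers are copied from the two Mali-T860 tables on this page. The helper name `speedups` is a hypothetical one chosen for this example.

```python
# Latencies in ms, copied from the Mali-T860 tables (batch_size = 1).
FP32_MS = {
    "densenet-121": 410.6, "inception-v3": 784.7, "mobilenet": 79.5,
    "mobilenet-v2": 77.7, "resnet-18": 127.3, "resnet-50": 354.7,
    "squeezenet-v1.0": 111.0, "squeezenet-v1.1": 62.5,
    "vgg-16": 673.2, "vgg-19": 792.1,
}
FP16_MS = {
    "densenet-121": 295.4, "inception-v3": 464.9, "mobilenet": 52.9,
    "mobilenet-v2": 60.7, "resnet-18": 84.3, "resnet-50": 221.0,
    "squeezenet-v1.0": 77.3, "squeezenet-v1.1": 46.7,
    "vgg-16": 405.6, "vgg-19": 472.8,
}


def speedups(baseline_ms, faster_ms):
    """Return baseline/faster latency ratios, keyed by model name."""
    return {model: baseline_ms[model] / faster_ms[model] for model in baseline_ms}


if __name__ == "__main__":
    for model, ratio in speedups(FP32_MS, FP16_MS).items():
        print(f"{model:16s} {ratio:.2f}x")
```

With these figures, float16 is faster for every listed model, ranging from roughly 1.28x (mobilenet-v2) to roughly 1.69x (inception-v3).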
- Jetson TX2 : in Max-N mode, 1.3GHz
- GTX 1080 Ti, GTX TITAN X
- dtype = float32, batch_size = 1 (unit: ms)
Device | densenet-121 | inception-v3 | mobilenet | mobilenet-v2 | resnet-18 | resnet-50 | vgg-16 | vgg-19 |
---|---|---|---|---|---|---|---|---|
GTX 1080 Ti | 3.6 | 5.8 | 0.7 | 1.0 | 1.1 | 2.8 | 4.2 | 4.8 |
GTX TITAN X | 5.8 | 9.9 | 1.0 | 1.6 | 1.6 | 4.3 | 6.3 | 7.4 |
Jetson TX2 | 26.8 | 45.7 | 5.2 | 8.8 | 9.6 | 26.2 | 58.2 | 68.8 |
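Because the tables report latency per image at batch size 1, a throughput figure follows directly as 1000 divided by the latency in milliseconds. The sketch below applies that conversion to the resnet-50 column of the table above. The function name `images_per_second` is a hypothetical one used only here, and the result assumes images are processed strictly one after another.

```python
# resnet-50 latencies in ms, copied from the table above (float32, batch_size = 1).
RESNET50_MS = {
    "GTX 1080 Ti": 2.8,
    "GTX TITAN X": 4.3,
    "Jetson TX2": 26.2,
}


def images_per_second(latency_ms):
    """Convert a per-image latency in milliseconds to images per second."""
    if latency_ms <= 0:
        # Non-positive entries cannot be converted to a throughput.
        raise ValueError(f"latency must be positive, got {latency_ms}")
    return 1000.0 / latency_ms


if __name__ == "__main__":
    for device, ms in RESNET50_MS.items():
        print(f"{device:12s} {images_per_second(ms):7.1f} images/s")
```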
- dtype = float32, batch_size = 1 (unit: ms)
Device | densenet-121 | inception-v3 | mobilenet | resnet-18 | resnet-50 | vgg-16 | vgg-19 |
---|---|---|---|---|---|---|---|
Vega FE | 5.8 | 8.9 | 1.0 | 1.6 | 4.5 | 6.3 | 7.2 |
TVM supports Adreno hardware through two paths: the native OpenCL path and the OpenCLML BYOC path.
OpenCLML is Qualcomm's proprietary acceleration operator library, implemented as an extension. The OpenCLML SDK is available to the developer community. More details about OpenCLML can be found at Qualcomm Developer Network OpenCLML and OpenCLML with TVM.
OpenCLML is integrated into TVM as a BYOC backend, which can accelerate operators using Qualcomm's hardware-aware proprietary operators.
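As a minimal sketch of the BYOC step, the helper below hands a Relay module to TVM's OpenCLML partitioner so that supported operators are offloaded to OpenCLML. It assumes a Relay-based TVM build that ships the `tvm.relay.op.contrib.clml` module; the exact API may differ between TVM versions. The wrapper name `partition_for_openclml` is hypothetical. Building and running the partitioned module on a device is not shown.

```python
def partition_for_openclml(mod, params=None):
    """Partition a Relay module so supported operators are offloaded to OpenCLML.

    Assumption: a Relay-era TVM build with the CLML contrib module available.
    """
    # Imported lazily so this file can be loaded on machines without TVM.
    from tvm.relay.op.contrib import clml

    return clml.partition_for_clml(mod, params)
```

Skipping this call leaves the module unpartitioned, so building it for an OpenCL target would use the native OpenCL path.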
- Snapdragon 8 Gen 1 : Adreno 730
- batch_size = 1 (unit: ms)
Precision | Resnet 18 | Resnet 34 | Resnet 50 | VGG-16 | VGG-19 | Densenet-121 | Inception V3 | MobilenetV1 | Squeezenet-v1.0 | Squeezenet-v1.1 |
---|---|---|---|---|---|---|---|---|---|---|
FP32 | 9.56 | 15.37 | 18.25 | 54.20 | 108.71 | 27.33 | 39.54 | 3.82 | 6.89 | 3.24 |
FP16 | 6.94 | 11.94 | 13.77 | 34.58 | 41.23 | 11.93 | 30.13 | 2.72 | 4.75 | 2.52 |
- batch_size = 1 (unit: ms)
Precision | Resnet 18 | Resnet 34 | Resnet 50 | Densenet-121 | Inception V3 | MobilenetV1 | Squeezenet-v1.0 | Squeezenet-v1.1 |
---|---|---|---|---|---|---|---|---|
FP32 | 9.75 | 15.22 | 25.43 | 15.56 | 26.63 | 3.85 | 8.90 | 2.79 |
FP16 | 4.52 | 7.34 | 13.17 | 7.87 | 12.44 | 1.54 | 3.28 | 1.31 |
See the README at https://github.com/dmlc/tvm/tree/master/apps/benchmark for how to reproduce these numbers.