- First, all reference deep learning models should be installed on Raspberry Pi OS with:
sudo apt install imx500-models
- Then, make sure you are located on the Picamera2 demo/example scripts where the demo applications scripts running the models are located.
git clone https://github.com/raspberrypi/picamera2.git
cd picamera2/examples/imx500
For each example under the reference deep learning models you will have to run an execution command indicating both model and demo application name. Available demo applications are as appears below:
Application Name | Task | Link |
---|---|---|
imx500_classification_demo.py |
Classification | Link |
imx500_object_detection_demo.py |
Object Detection | Link |
imx500_pose_estimation_higherhrnet_demo.py |
Pose Estimation | Link |
imx500_segmentation_demo.py |
Segmentation | Link |
- Task: Categorize input data into predefined classes and provide a confidence score.
- Training dataset:
Imagenet
. Designed for use in visual object recognition research. It contains over 14 million images, making it one of the most extensive resources available for training deep learning models. It comprises 1000 classes.
Model | [Top 1] Accuracy - Quantized(Float) | Input Resolution | Picamera2 Example Script |
---|---|---|---|
EfficientNet-B0 | 72.128 (73.876) | 224x224 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_efficientnet_bo.rpk |
EfficientNet Lite-0 | 75.252 (75.28) | 224x224 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_efficientnet_lite0.rpk |
EfficientNetV2-B0 | 76.674 (76.424) | 224x224 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_efficientnetv2_b0.rpk |
EfficientNetV2-B1 | 77.032 (76.93) | 240x240 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_efficientnetv2_b1.rpk |
EfficientNetV2-B2 | 77.716 (77.94) | 260x260 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_efficientnetv2_b2.rpk |
MnasNet1.0 | 73.16 (73.078) | 224x224 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_mnasnet1.0.rpk |
MobileNetV2 | 71.572 (71.3) | 224x224 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_mobilenet_v2.rpk |
MobileViT-XS | 72.326 (72.412) | 256x256 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_mobilevit_xs.rpk |
MobileViT-XXS | 67.440 (67.40) | 256x256 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_mobilevit_xxs.rpk |
RegNetX-002 | 68.352 (68.20) | 224x224 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_regnetx_002.rpk |
RegNetY-002 | 69.424 (69.60) | 224x224 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_regnety_002.rpk |
RegNetY-004 | 73.830 (73.37) | 224x224 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_regnety_004.rpk |
ResNet-18 | 68.57 (68.546) | 224x224 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_resnet18.rpk |
ShuffleNetV2-x1.5 | 72.194 (72.498) | 224x224 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_shufflenet_v2_x1_5.rpk |
SqueezeNet-V1.0 | 57.598 (57.584) | 224x224 | python imx500_classification_demo.py --model /usr/share/imx500-models/imx500_network_squeezenet1.0.rpk |
- Task: Identify and locate multiple objects within an image by classifying each object.
- Training dataset:
COCO
. Designed to encourage research on a wide variety of object categories and is commonly used for benchmarking computer vision models. It comprises 80 classes.
Model | [mAP] Accuracy - Quantized(Float) | Input Resolution | Picamera2 Example Script |
---|---|---|---|
Efficientdet Lite-0 (pp*) | 0.252 (0.2518) | 320x320 | python imx500_object_detection_demo.py --model /usr/share/imx500-models/imx500_network_efficientdet_lite0_pp.rpk |
Nanodet Plus | 0.332 (0.3316) | 416x416 | python imx500_object_detection_demo.py --model /usr/share/imx500-models/imx500_network_nanodet_plus_416x416.rpk |
Nanodet Plus (pp*) | 0.320 (0.3232) | 416x416 | python imx500_object_detection_demo.py --model /usr/share/imx500-models/imx500_network_nanodet_plus_416x416_pp.rpk |
SSD MobileNetV2 FPN Lite (pp*) | 0.218 (0.219) | 320x320 | python imx500_object_detection_demo.py --model /usr/share/imx500-models/imx500_network_ssd_mobilenetv2_fpnlite_320x320_pp.rpk |
*pp = post-processing is included in the network and is executed on the IMX500 Edge AI Processor
- Task: Assign a category to each pixel in an image, offering a comprehensive analysis of the image's content.
- Training dataset:
PASCAL VOC
. Designed to encourage research on a wide variety of object categories and is commonly used for benchmarking computer vision models. It comprises 20 object categories.
Model | [mIOU] Accuracy - Quantized(Float) | Input Resolution | Picamera2 Example Script |
---|---|---|---|
DeepLabv3Plus | 0.7214 (0.724) | 320x320 | python imx500_segmentation_demo.py --model /usr/share/imx500-models/imx500_network_deeplabv3plus.rpk |
- Task: Detect key points or landmarks on objects or humans in images or videos.
- Training dataset:
COCO (KeyPoints)
Model | [mAP] Accuracy - Quantized(Float) | Input Resolution | Picamera2 Example Script |
---|---|---|---|
HigherHRNet | 0.188 (0.1867) | 288x384 | python imx500_pose_estimation_higherhrnet_demo.py --model /usr/share/imx500-models/imx500_network_higherhrnet_coco.rpk |
Models in this repo are distributed under a number of licenses listed in in the LICENSES directory.