TensorFlow detection model zoo

We provide a collection of detection models pre-trained on the COCO dataset, the Kitti dataset, the Open Images dataset, the AVA v2.1 dataset and the iNaturalist Species Detection Dataset. These models can be useful for out-of-the-box inference if you are interested in categories already in those datasets. They are also useful for initializing your models when training on novel datasets.

In the tables below, we list each pre-trained model, including:

  • a model name that corresponds to the config file in the samples/configs directory that was used to train the model,
  • a download link to a tar.gz file containing the pre-trained model,
  • model speed: we report running time in ms per 600x600 image (including all pre- and post-processing). These timings depend heavily on the specific hardware configuration (they were measured on an Nvidia GeForce GTX TITAN X card) and in many cases should be treated as relative timings. Also note that desktop GPU timing does not always reflect mobile run time. For example, Mobilenet V2 is faster on mobile devices than Mobilenet V1, but is slightly slower on a desktop GPU.
  • detector performance on a subset of the COCO validation set or the Open Images test split, as measured by the dataset-specific mAP measure. Here, higher is better, and we only report bounding box mAP rounded to the nearest integer.
  • Output types (Boxes, and Masks if applicable)
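Because the reported timings are best read as relative numbers, it can be worth re-measuring on your own hardware. The following is a minimal sketch, not the procedure used to produce the tables: `time_per_call_ms` and `fake_detector` are hypothetical names, and `fake_detector` merely stands in for whatever function runs pre-processing, inference and post-processing on one image in your setup.

```python
import statistics
import time


def time_per_call_ms(fn, warmup=3, runs=20):
    """Return the median wall-clock time of fn() in milliseconds.

    Warm-up calls are discarded so that one-off costs (loading, memory
    allocation) do not skew the measurement.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def fake_detector():
    # Hypothetical stand-in for pre-processing + inference + post-processing
    # on a single 600x600 image.
    time.sleep(0.002)


if __name__ == "__main__":
    print(f"median latency: {time_per_call_ms(fake_detector):.1f} ms")
```

The median is used rather than the mean so that a single slow run has less influence on the reported number.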

You can un-tar each tar.gz file with, for example:

tar -xzvf ssd_mobilenet_v1_coco.tar.gz

Inside the un-tar'ed directory, you will find:

  • a graph proto (graph.pbtxt)
  • a checkpoint (model.ckpt.data-00000-of-00001, model.ckpt.index, model.ckpt.meta)
  • a frozen graph proto with weights baked into the graph as constants (frozen_inference_graph.pb) to be used for out-of-the-box inference (try this out in the Jupyter notebook!)
  • a config file (pipeline.config) which was used to generate the graph. It corresponds directly to a config file in the samples/configs directory, but often with a modified score threshold. In the case of the heavier Faster R-CNN models, we also provide a version of the model that uses a highly reduced number of proposals for speed.
  • Mobile models only: a TFLite file (model.tflite) that can be deployed on mobile devices.
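If you prefer to unpack an archive from Python rather than the shell, the sketch below does the same thing with the standard library and then reports which of the files listed above are absent. It is illustrative only: `find_missing_files` is a hypothetical helper, and the file names are simply copied from the list above (model.tflite is left out because it ships only with mobile models).

```python
import tarfile
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = [
    "graph.pbtxt",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "frozen_inference_graph.pb",
    "pipeline.config",
]


def find_missing_files(archive_path, dest_dir):
    """Un-tar a model archive and return the expected files that are absent."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Only extract archives from a source you trust.
        tar.extractall(dest)
    # Search recursively, so no assumption is made about the directory layout.
    present = {p.name for p in dest.rglob("*") if p.is_file()}
    return [name for name in EXPECTED_FILES if name not in present]


# Example call (assumes the archive from the tar command above is present):
# print(find_missing_files("ssd_mobilenet_v1_coco.tar.gz", "models"))
```

An empty list means every expected file was found. Quantized models ship a different set of files, so a non-empty result is not necessarily an error.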

Some remarks on frozen inference graphs:

  • If you try to evaluate the frozen graph, you may find performance numbers for some of the models to be slightly lower than what we report in the tables below. This is because we discard detections with scores below a threshold (typically 0.3) when creating the frozen graph. This effectively corresponds to picking a point on the precision-recall curve of a detector (and discarding the part past that point), which negatively impacts standard mAP metrics.
  • Our frozen inference graphs are generated using the v1.12.0 release of TensorFlow, and we do not guarantee that they will work with other versions. That said, each frozen inference graph can be regenerated with your current version of TensorFlow by re-running the exporter, pointing it at the model directory as well as the corresponding config file in samples/configs.
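To see why a score cut-off lowers mAP, consider the toy calculation below. It is a sketch under simplifying assumptions: it uses a plain, non-interpolated average precision for a single class rather than the COCO or Open Images protocols, and the detections are made up for illustration. The names `average_precision` and `drop_low_scores` are hypothetical.

```python
def average_precision(detections, num_groundtruth):
    """Simplified (non-interpolated) AP for a single class.

    detections: list of (score, is_true_positive) pairs.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / num_groundtruth


def drop_low_scores(detections, threshold=0.3):
    """Mimic the frozen graph by discarding detections below the threshold."""
    return [d for d in detections if d[0] >= threshold]


# Made-up detections: (score, does it match a groundtruth box?)
detections = [
    (0.9, True), (0.8, True), (0.6, False), (0.5, True),
    (0.25, True), (0.1, False),
]
num_groundtruth = 5

full_ap = average_precision(detections, num_groundtruth)
truncated_ap = average_precision(drop_low_scores(detections), num_groundtruth)
print(f"AP with all detections:     {full_ap:.3f}")
print(f"AP after 0.3 score cut-off: {truncated_ap:.3f}")
```

The correct detection with score 0.25 is lost after the cut-off, so the recall it would have contributed never enters the average, and the truncated AP comes out lower.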

COCO-trained models

| Model name | Speed (ms) | COCO mAP [1] | Outputs |
| --- | --- | --- | --- |
| ssd_mobilenet_v1_coco | 30 | 21 | Boxes |
| ssd_mobilenet_v1_0.75_depth_coco ☆ | 26 | 18 | Boxes |
| ssd_mobilenet_v1_quantized_coco ☆ | 29 | 18 | Boxes |
| ssd_mobilenet_v1_0.75_depth_quantized_coco ☆ | 29 | 16 | Boxes |
| ssd_mobilenet_v1_ppn_coco ☆ | 26 | 20 | Boxes |
| ssd_mobilenet_v1_fpn_coco ☆ | 56 | 32 | Boxes |
| ssd_resnet_50_fpn_coco ☆ | 76 | 35 | Boxes |
| ssd_mobilenet_v2_coco | 31 | 22 | Boxes |
| ssd_mobilenet_v2_quantized_coco | 29 | 22 | Boxes |
| ssdlite_mobilenet_v2_coco | 27 | 22 | Boxes |
| ssd_inception_v2_coco | 42 | 24 | Boxes |
| faster_rcnn_inception_v2_coco | 58 | 28 | Boxes |
| faster_rcnn_resnet50_coco | 89 | 30 | Boxes |
| faster_rcnn_resnet50_lowproposals_coco | 64 | | Boxes |
| rfcn_resnet101_coco | 92 | 30 | Boxes |
| faster_rcnn_resnet101_coco | 106 | 32 | Boxes |
| faster_rcnn_resnet101_lowproposals_coco | 82 | | Boxes |
| faster_rcnn_inception_resnet_v2_atrous_coco | 620 | 37 | Boxes |
| faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco | 241 | | Boxes |
| faster_rcnn_nas | 1833 | 43 | Boxes |
| faster_rcnn_nas_lowproposals_coco | 540 | | Boxes |
| mask_rcnn_inception_resnet_v2_atrous_coco | 771 | 36 | Masks |
| mask_rcnn_inception_v2_coco | 79 | 25 | Masks |
| mask_rcnn_resnet101_atrous_coco | 470 | 33 | Masks |
| mask_rcnn_resnet50_atrous_coco | 343 | 29 | Masks |

Note: The star (☆) at the end of a model name indicates that the model supports TPU training.

Note: If you download and un-tar the tar.gz file of a quantized model, you will get a different set of files: a checkpoint, a config file and tflite frozen graphs (txt/binary).
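One way to use the COCO table is to pick the most accurate model that fits a latency budget. The sketch below does this for a handful of rows copied from the table above. `best_under_budget` is a hypothetical helper, and since the timings are hardware-dependent, the result is only a starting point for your own measurements.

```python
# (model name, speed in ms, COCO mAP), copied from a few rows of the table above.
COCO_MODELS = [
    ("ssd_mobilenet_v1_coco", 30, 21),
    ("ssd_mobilenet_v2_coco", 31, 22),
    ("ssd_inception_v2_coco", 42, 24),
    ("faster_rcnn_inception_v2_coco", 58, 28),
    ("faster_rcnn_resnet101_coco", 106, 32),
    ("faster_rcnn_nas", 1833, 43),
]


def best_under_budget(models, max_ms):
    """Return the highest-mAP model whose reported speed is within max_ms.

    Ties on mAP are broken in favour of the faster model. Returns None when
    no model fits the budget.
    """
    candidates = [m for m in models if m[1] <= max_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda m: (m[2], -m[1]))


print(best_under_budget(COCO_MODELS, max_ms=60))
```

With a 60 ms budget this selects faster_rcnn_inception_v2_coco, the highest-mAP entry among the listed rows at or below that speed.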

Mobile models

| Model name | Pixel 1 Latency (ms) | COCO mAP | Outputs |
| --- | --- | --- | --- |
| ssd_mobilenet_v3_large_coco | 119 | 22.3 | Boxes |
| ssd_mobilenet_v3_small_coco | 43 | 15.6 | Boxes |

Pixel 4 Edge TPU models

| Model name | Pixel 4 Edge TPU Latency (ms) | COCO mAP | Outputs |
| --- | --- | --- | --- |
| ssd_mobilenet_edgetpu_coco | 6.6 | 24.3 | Boxes |

Kitti-trained models

| Model name | Speed (ms) | Pascal mAP@0.5 | Outputs |
| --- | --- | --- | --- |
| faster_rcnn_resnet101_kitti | 79 | 87 | Boxes |

Open Images-trained models

| Model name | Speed (ms) | Open Images mAP@0.5 [2] | Outputs |
| --- | --- | --- | --- |
| faster_rcnn_inception_resnet_v2_atrous_oidv2 | 727 | 37 | Boxes |
| faster_rcnn_inception_resnet_v2_atrous_lowproposals_oidv2 | 347 | | Boxes |
| facessd_mobilenet_v2_quantized_open_image_v4 [3] | 20 | 73 (faces) | Boxes |

| Model name | Speed (ms) | Open Images mAP@0.5 [4] | Outputs |
| --- | --- | --- | --- |
| faster_rcnn_inception_resnet_v2_atrous_oidv4 | 425 | 54 | Boxes |
| ssd_mobilenetv2_oidv4 | 89 | 36 | Boxes |
| ssd_resnet_101_fpn_oidv4 | 237 | 38 | Boxes |

iNaturalist Species-trained models

| Model name | Speed (ms) | Pascal mAP@0.5 | Outputs |
| --- | --- | --- | --- |
| faster_rcnn_resnet101_fgvc | 395 | 58 | Boxes |
| faster_rcnn_resnet50_fgvc | 366 | 55 | Boxes |

AVA v2.1 trained models

| Model name | Speed (ms) | Pascal mAP@0.5 | Outputs |
| --- | --- | --- | --- |
| faster_rcnn_resnet101_ava_v2.1 | 93 | 11 | Boxes |

Footnotes

  1. See the MSCOCO evaluation protocol. The COCO mAP numbers here are evaluated on the COCO 14 minival set (note that our split is different from COCO 17 Val). A full list of the image ids used in our split can be found here.

  2. This is PASCAL mAP with a slightly different way of computing true positives: see Open Images evaluation protocols, oid_V2_detection_metrics.

  3. Non-face boxes are dropped during training and non-face groundtruth boxes are ignored when evaluating.

  4. This is the Open Images Challenge metric: see Open Images evaluation protocols, oid_challenge_detection_metrics.