Semantic Segmentation with tkDNN

Currently, ShelfNet is the only semantic segmentation network supported by tkDNN.

Run the demo

To run the semantic segmentation demo, follow these steps (example with ShelfNet):

rm shelfnet_fp32.rt        # be sure to delete (or move) old TensorRT files
export TKDNN_BATCHSIZE=4   # the batch size must be greater than 1 to run inference on images larger than 1024x1024
./test_shelfnet            # run the ShelfNet test, which generates the rt file (slow)
./seg_demo shelfnet_fp32.rt ../demo/yolo_test.mp4 1 19

In general the demo program takes the following parameters:

./seg_demo <network-rt-file> <path-to-video> <n-batches> <number-of-classes> <resize-flag> <baseline-resize> <show-flag> <write-pred>

where

  • <network-rt-file> is the rt file generated by a test.
  • <path-to-video> is the path to a video file or a camera input.
  • <n-batches> is the number of batches to use in inference (N.B. you must first export TKDNN_BATCHSIZE set to the required n-batches and regenerate the rt file for the network).
  • <number-of-classes> is the number of classes the network was trained on.
  • <resize-flag>: if set to 0, the demo does not resize the input frames and uses them as they are; otherwise it resizes them.
  • <baseline-resize>: if <resize-flag> is set to 1, the input frames are resized proportionally, using <baseline-resize> as the target width.
  • <show-flag>: if set to 0, the demo does not show the visualization but saves the video to result.mp4 (if n-batches == 1).
  • <write-pred>: if set to 0 (default), the demo runs normally; otherwise the evaluation of a dataset runs and the segmentation output is saved. Attention: this is under development and the paths are hard-coded, so change them in the code in advance.
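As a rough illustration of the parameter order listed above, the following Python sketch assembles the argument list for seg_demo and computes the frame size implied by <baseline-resize>. The helper names build_seg_demo_args and resized_shape are hypothetical and not part of tkDNN. The rounding rule used for the height is an assumption, since the exact behaviour of the demo's resize is not documented here.

```python
from typing import List, Tuple


def build_seg_demo_args(rt_file: str, video: str, n_batches: int,
                        n_classes: int, resize_flag: int,
                        baseline_resize: int, show_flag: int,
                        write_pred: int = 0) -> List[str]:
    """Return the argv list for seg_demo in the documented positional order.

    Hypothetical helper: only write_pred has a default (0), because that is
    the only default stated in the parameter list.
    """
    return [
        "./seg_demo", rt_file, video,
        str(n_batches), str(n_classes),
        str(resize_flag), str(baseline_resize),
        str(show_flag), str(write_pred),
    ]


def resized_shape(width: int, height: int, baseline: int) -> Tuple[int, int]:
    """Scale (width, height) so that the width equals `baseline`.

    Assumption: the aspect ratio is preserved and the height is rounded
    half-up to the nearest integer; the real demo may truncate instead.
    """
    if width <= 0 or height <= 0 or baseline <= 0:
        raise ValueError("width, height and baseline must be positive")
    new_height = (height * baseline + width // 2) // width
    return baseline, new_height


if __name__ == "__main__":
    args = build_seg_demo_args("shelfnet_fp32.rt", "../demo/yolo_test.mp4",
                               n_batches=1, n_classes=19, resize_flag=1,
                               baseline_resize=1024, show_flag=1)
    print(" ".join(args))
    print(resized_shape(1920, 1080, 1024))
```

The list returned by build_seg_demo_args can be passed to subprocess.run from the build directory. Keeping every value as a positional string mirrors how the demo reads its arguments.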

NB) By default, FP32 inference is used.

NB) Batching is not used to process multiple streams, but to process multiple tiles of the same image. ShelfNet never resizes the input image; therefore, for images larger than 1024x1024, tiles of 1024x1024 are fed to the network as a batch.
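To make the relation between image size and batch size concrete, here is a minimal Python sketch of the tile arithmetic. It assumes non-overlapping 1024x1024 tiles and counts partial border tiles as full tiles. How tkDNN actually handles image sizes that are not multiples of 1024 is not stated here, so treat that part as an assumption. The function names are hypothetical.

```python
from typing import List, Tuple

TILE = 1024  # tile side fed to ShelfNet, as stated in the note above


def tile_grid(width: int, height: int, tile: int = TILE) -> Tuple[int, int]:
    """Return (columns, rows) of tiles needed to cover the image.

    Assumption: tiles do not overlap and a partial tile at the right or
    bottom border still occupies one slot in the batch.
    """
    if width <= 0 or height <= 0 or tile <= 0:
        raise ValueError("width, height and tile must be positive")
    cols = (width + tile - 1) // tile   # integer ceiling division
    rows = (height + tile - 1) // tile
    return cols, rows


def required_batch_size(width: int, height: int, tile: int = TILE) -> int:
    """Number of tiles, i.e. the batch size needed to process one image."""
    cols, rows = tile_grid(width, height, tile)
    return cols * rows


def tile_origins(width: int, height: int, tile: int = TILE) -> List[Tuple[int, int]]:
    """Top-left (x, y) corner of every tile, in row-major order."""
    cols, rows = tile_grid(width, height, tile)
    return [(c * tile, r * tile) for r in range(rows) for c in range(cols)]


if __name__ == "__main__":
    for w, h in [(1024, 1024), (2048, 2048)]:
        print(f"{w}x{h}: batch size {required_batch_size(w, h)}")
    print(tile_origins(2048, 2048))
```

Under these assumptions a 2048x2048 image needs four tiles, which matches exporting TKDNN_BATCHSIZE=4 before building the rt file.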

[demo GIF]

For other demo videos refer to this playlist.

NB) The gif and the videos were obtained with Mapillary Vistas weights, which we cannot share publicly due to license restrictions. However, you can train ShelfNet on Mapillary using this fork of the original repo.

FPS Results

Inference FPS of ShelfNet with tkDNN, averaged over 1200 images, on:

  • RTX 2080Ti (CUDA 10.2, TensorRT 7.0.0, cuDNN 7.6.5);
  • Xavier AGX, JetPack 4.3 (CUDA 10.0, cuDNN 7.6.3, TensorRT 6.0.1).

| Platform | Test | Phase | FP32, ms | FP32, FPS | FP16, ms | FP16, FPS | INT8, ms | INT8, FPS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RTX 2080Ti | shelfnet 1024x1024 (B=1) | pre | 6.11863 | 163.435 | 5.81465 | 171.979 | 5.88699 | 169.866 |
| RTX 2080Ti | shelfnet 1024x1024 (B=1) | inf | 11.5464 | 86.6074 | 7.35396 | 135.981 | 6.37623 | 156.832 |
| RTX 2080Ti | shelfnet 1024x1024 (B=1) | post | 4.09058 | 244.464 | 3.91961 | 255.128 | 4.07343 | 245.493 |
| RTX 2080Ti | shelfnet 1024x1024 (B=1) | tot | 21.7556 | 45.9652 | 17.0882 | 58.5199 | 16.3366 | 61.2121 |
| RTX 2080Ti | shelfnet 2048x2048 (B=4) | pre | 25.435 | 39.3158 | 25.2953 | 39.5331 | 25.9303 | 38.565 |
| RTX 2080Ti | shelfnet 2048x2048 (B=4) | inf | 36.5015 | 27.3961 | 17.0534 | 58.6395 | 15.6061 | 64.0773 |
| RTX 2080Ti | shelfnet 2048x2048 (B=4) | post | 17.3917 | 57.4985 | 17.1649 | 58.2583 | 17.5539 | 56.9675 |
| RTX 2080Ti | shelfnet 2048x2048 (B=4) | tot | 79.3283 | 12.6058 | 59.5136 | 16.8029 | 59.0903 | 16.9233 |
| AGX Xavier | shelfnet 1024x1024 (B=1) | pre | 8.0174 | 124.729 | 7.5117 | 133.126 | 7.47333 | 133.809 |
| AGX Xavier | shelfnet 1024x1024 (B=1) | inf | 72.4173 | 13.8089 | 37.505 | 26.6631 | 31.3286 | 31.9197 |
| AGX Xavier | shelfnet 1024x1024 (B=1) | post | 8.89958 | 112.365 | 8.83576 | 113.176 | 9.42655 | 106.083 |
| AGX Xavier | shelfnet 1024x1024 (B=1) | tot | 89.3342 | 11.1939 | 53.8525 | 18.5692 | 48.2285 | 20.7346 |
| AGX Xavier | shelfnet 2048x2048 (B=4) | pre | 47.1454 | 21.211 | 21.6475 | 46.1947 | 21.4201 | 46.6851 |
| AGX Xavier | shelfnet 2048x2048 (B=4) | inf | 266.537 | 3.75183 | 128.321 | 7.79293 | 107.621 | 9.29185 |
| AGX Xavier | shelfnet 2048x2048 (B=4) | post | 44.0711 | 22.6906 | 40.1732 | 24.8922 | 39.873 | 25.0796 |
| AGX Xavier | shelfnet 2048x2048 (B=4) | tot | 357.753 | 2.79522 | 190.142 | 5.25922 | 168.914 | 5.92016 |
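
The columns in the results above are related by simple arithmetic: each FPS value is 1000 divided by the latency in milliseconds, and each tot row is the sum of the pre, inf and post latencies. The Python sketch below checks this on the RTX 2080Ti, 1024x1024, FP32 figures. The function names are illustrative only and do not come from tkDNN.

```python
def ms_to_fps(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def total_latency_ms(pre_ms: float, inf_ms: float, post_ms: float) -> float:
    """Sum of pre-processing, inference and post-processing latencies."""
    return pre_ms + inf_ms + post_ms


def speedup(baseline_ms: float, other_ms: float) -> float:
    """How many times faster `other_ms` is compared with `baseline_ms`."""
    if other_ms <= 0:
        raise ValueError("latency must be positive")
    return baseline_ms / other_ms


if __name__ == "__main__":
    # RTX 2080Ti, shelfnet 1024x1024 (B=1), FP32 latencies from the results
    pre, inf, post = 6.11863, 11.5464, 4.09058
    tot = total_latency_ms(pre, inf, post)
    print(f"total latency: {tot:.4f} ms")
    print(f"total FPS:     {ms_to_fps(tot):.4f}")
    # Inference-only speedup of INT8 over FP32 on the same platform and size
    print(f"INT8 vs FP32:  {speedup(11.5464, 6.37623):.2f}x")
```

The same helpers can be applied to any other row, for example to compare FP16 and INT8 totals on the AGX Xavier.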

Known issues

When creating the rt file, all the checks return errors. This is due to a different resize function and a different handling of the original ShelfNet outputs. However, the network is supposed to work.