Tired of long inference times with your favourite YOLOv8 models?
Then this library is for you! Run YOLOv8 classification, detection, pose and segmentation models as engines with Nvidia TensorRT. Seamlessly obtain results, or even draw the result overlay on top of the image, with just a couple of lines of code.
Example result overlay for detection, pose and segmentation (image source)
Example result of the masks obtained from the segmentation model for each object (see ex_draw_seg_mask.py)
Example result overlay for YOLOv8.1 with OBB (image source)
This library relies on Nvidia-specific features, therefore an Nvidia GPU, a working TensorRT installation (Link), and the ONNX file(s) of the desired model(s) are required.
The easiest way to install and use the library is through Nvidia's TensorRT Docker image available on Nvidia NGC (Link; the library has been tested on the tensorrt:24.01-py3 image).
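For reference, the tested image can be pulled from NGC with a command along these lines (a sketch; the exact registry path and tag depend on the TensorRT release you target):

```bash
$ docker pull nvcr.io/nvidia/tensorrt:24.01-py3
```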
- If not already done, configure Docker to use the Nvidia runtime as default by editing `/etc/docker/daemon.json` and adding `"default-runtime": "nvidia"` as the first entry (Link); a sample `daemon.json` is shown after this list.
- Copy the content of the `Dockerfile` file and build the image with:

  ```bash
  $ docker build -t image:tag .
  ```
- Run the image with:

  ```bash
  $ docker run -it image:tag
  ```
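For reference, a minimal `/etc/docker/daemon.json` with the Nvidia runtime set as default might look like the sketch below. The `runtimes` entry is normally created by the Nvidia Container Toolkit installer, so usually only the `"default-runtime"` key needs to be added; remember to restart the Docker daemon afterwards.

```json
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```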
- Clone the repository to your local machine and install it with:

  ```bash
  $ git clone https://github.com/Armaggheddon/tensorrt_yolov8.git
  $ cd tensorrt_yolov8
  $ pip install -U .
  ```
- Alternatively, install the library directly with pip:

  ```bash
  $ pip install git+https://github.com/Armaggheddon/tensorrt_yolov8.git
  ```
- Uninstall the library with:

  ```bash
  $ pip uninstall tensorrt_yolov8
  ```
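To quickly verify that the installation succeeded, a minimal sanity check along these lines can be used (illustrative; it assumes the TensorRT Python bindings are available, as in the NGC image):

```python
# Minimal post-install sanity check (illustrative)
import tensorrt
import tensorrt_yolov8

# Engines must be built and run with the same TensorRT version,
# so knowing the installed version is useful (see the usage notes below).
print("TensorRT version:", tensorrt.__version__)
```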
- Obtain the ONNX file of the desired YOLO model. This can easily be done with the Ultralytics library (Link). For example, the following commands install Ultralytics, then download and export the yolov8s detection model to the current path (see Link for the list of available model types):

  ```bash
  $ pip install ultralytics
  $ yolo export model=yolov8s.pt format=onnx
  ```
- Convert the ONNX model to an Nvidia TensorRT engine. This can be done with the `trtexec` utility (generally located at `/usr/src/tensorrt/bin/trtexec`; a sample invocation is sketched after this list) or with the utility function provided by this library:

  ```python
  from tensorrt_yolov8.utils import engine_builder

  engine_builder.build_engine_from_onnx(
      "path/to/onnx/model.onnx",
      "path/to/created/model.engine",
  )
  ```

  By default the engine is built with FP32 precision and batch size 1. This operation is required only the first time: the same engine can then be reused for every inference run. If the TensorRT version the engine was built with differs from the one used to run it, the library will complain about the mismatch; fix the issue by rebuilding the engine with the code above.
- Run the exported model and perform inference with:

  ```python
  import cv2

  from tensorrt_yolov8 import Pipeline

  # Load the detection pipeline with the previously built engine
  detection = Pipeline("detection", "path/to/model.engine")

  img = cv2.imread("img.jpg")

  # Keep only detections with confidence >= 0.5
  results = detection(img, min_prob=0.5)

  # Draw the result overlay on top of the input image and save it
  img_result = detection.draw_results(img, results)
  cv2.imwrite("result.jpg", img_result)
  ```
- For additional examples, see the files in Examples.
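As mentioned in the conversion step above, `trtexec` can be used instead of the `engine_builder` helper. A roughly equivalent invocation (FP32, batch size 1) might look like the following sketch; the binary path and the exact flags depend on your TensorRT installation:

```bash
$ /usr/src/tensorrt/bin/trtexec \
    --onnx=path/to/onnx/model.onnx \
    --saveEngine=path/to/created/model.engine
```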
Warning
Model export is still a work in progress 🚧. Currently the batch size setting does not work correctly and the model is exported using only the minimum batch size.
- Support for different (static) batch sizes
- Support for YOLOv8.1 OBB (Oriented Bounding Box)
- Support for dynamic batch sizes