TensorRT C/C++ Inference

This repository provides C++ and C examples that use TensorRT to run inference on models implemented in PyTorch, JAX, or TensorFlow.

Requirements

Python 3.x | TensorRT | PyTorch | TensorFlow | ONNX | tf2onnx | JAX | CUDA Toolkit

Setup

Clone the repository:

```
git clone https://github.com/ggluo/TensorRT-Cpp-Example.git
cd TensorRT-Cpp-Example
```

Install onnx, onnxscript, and tf2onnx if they are not already installed:
```
pip install onnx onnxscript tf2onnx
```
Ensure that TensorRT and the CUDA Toolkit are installed on your system, and set the TensorRT library and include paths accordingly in the makefile:
```
LDFLAGS = -L/path/to/TensorRT/lib
INCLUDEDIRS = -I/path/to/TensorRT/include
```

Running all tests

To run the test script, execute the following commands:

```
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/TensorRT/lib
bash run_all.sh
```
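If the binary fails to start because the dynamic loader cannot find the TensorRT shared libraries, it can help to confirm that the library directory really is listed in `LD_LIBRARY_PATH`. The following is a minimal sketch using only the C++ standard library; `path_list_contains` and `in_ld_library_path` are hypothetical helpers, not part of this repository, and the comparison is an exact string match (so `/opt/trt/lib` and `/opt/trt/lib/` count as different entries).

```cpp
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical helper (not part of this repository): returns true if `dir`
// appears as one entry of a colon-separated path list such as LD_LIBRARY_PATH.
// Entries are compared as exact strings; no path normalization is done.
bool path_list_contains(const std::string& path_list, const std::string& dir) {
    std::istringstream entries(path_list);
    std::string entry;
    while (std::getline(entries, entry, ':')) {
        if (entry == dir) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: checks the LD_LIBRARY_PATH of the current process.
// Returns false when the variable is not set at all.
bool in_ld_library_path(const std::string& dir) {
    const char* value = std::getenv("LD_LIBRARY_PATH");
    return value != nullptr && path_list_contains(value, dir);
}
```

Calling `in_ld_library_path("/path/to/TensorRT/lib")` at the start of a diagnostic program would report whether the export above took effect in the current shell.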

The script run_all.sh performs the following steps:

1. Exports the ONNX model: `python python/export_model.py data/model.onnx`
2. Compiles the TensorRT inference code: `make`
3. Runs the TensorRT inference code: `./main data/model.onnx data/first_engine.trt`

The exported ONNX model is written to data/model.onnx, and the resulting TensorRT engine is saved to data/first_engine.trt.
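Once an engine file such as data/first_engine.trt exists, reusing it means reading the serialized bytes back into memory first (loading an engine from file is still listed under TODO). The sketch below shows only that file-reading step with the C++ standard library. `read_engine_file` is a hypothetical helper, not part of this repository. In the TensorRT C++ API, a buffer like this is what `nvinfer1::IRuntime::deserializeCudaEngine` accepts as a pointer and a size; that call is left out here so the example stays independent of a TensorRT installation.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of this repository): reads a serialized
// engine file into memory as raw bytes. Throws std::runtime_error if the
// file cannot be opened or read.
std::vector<char> read_engine_file(const std::string& path) {
    // Open in binary mode, positioned at the end, so tellg() gives the size.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) {
        throw std::runtime_error("cannot open engine file: " + path);
    }

    const std::streamsize size = file.tellg();
    if (size < 0) {
        throw std::runtime_error("cannot determine size of: " + path);
    }
    file.seekg(0, std::ios::beg);

    // Read the whole file in one call.
    std::vector<char> buffer(static_cast<std::size_t>(size));
    if (size > 0 && !file.read(buffer.data(), size)) {
        throw std::runtime_error("failed to read engine file: " + path);
    }
    return buffer;
}
```

A caller would pass `buffer.data()` and `buffer.size()` on to the deserialization step, and fall back to rebuilding the engine from the ONNX model if the read throws.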

Overview of `main.cpp`

The main.cpp file contains the main entry point for the TensorRT inference code. Below is an overview of its functionality:

```
#include "trt.h"
#include <iostream>
#include <vector>

int main(int argc, char** argv) {
    std::cout << "Hello World from TensorRT" << std::endl;

    // Expect two arguments: the ONNX model path and the engine output path
    if (argc < 3) {
        std::cerr << "Usage: " << argv[0] << " <model.onnx> <engine.trt>" << std::endl;
        return 1;
    }

    // Parse command-line arguments
    infer_params params{argv[1], 1, argv[2], ""};

    // Initialize TensorRT inference object
    trt_infer trt(params);
    trt.build();

    // Copy input data from host to device
    trt.CopyFromHostToDevice({0.5f, -0.5f, 1.0f}, 0, nullptr);

    // Perform inference
    trt.infer();

    // Copy output data from device to host
    std::vector<float> output(2, 0.0f);
    trt.CopyFromDeviceToHost(output, 1, nullptr);

    // Print output
    std::cout << "Output: " << output[0] << ", " << output[1] << std::endl;

    return 0;
}
```

This code performs the following steps:

1. Initializes the TensorRT inference parameters using command-line arguments.
2. Initializes the TensorRT inference object and builds the inference engine.
3. Copies input data from the host to the device.
4. Performs inference.
5. Copies output data from the device to the host.
6. Prints the output.
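The example hard-codes the buffer sizes: three input values and an output vector of two floats. For a different model, the host buffers have to match the element count of each tensor, which is the product of its dimensions. A minimal sketch of that calculation follows. `element_count` and `make_host_buffer` are hypothetical helpers, not part of `trt.h`, and the shape is passed as a plain `std::vector` on the assumption that the caller has already obtained it from the engine.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of trt.h): number of elements in a tensor
// with the given shape. An empty shape is treated as a scalar (one element).
// Negative dimensions, e.g. -1 for an unresolved dynamic axis, are rejected.
std::size_t element_count(const std::vector<std::int64_t>& dims) {
    std::size_t count = 1;
    for (std::int64_t d : dims) {
        if (d < 0) {
            throw std::invalid_argument("dimension must be non-negative");
        }
        count *= static_cast<std::size_t>(d);
    }
    return count;
}

// Hypothetical helper: zero-initialized host buffer sized for the given shape.
std::vector<float> make_host_buffer(const std::vector<std::int64_t>& dims) {
    return std::vector<float>(element_count(dims), 0.0f);
}
```

With an output shape of `{1, 2}`, `make_host_buffer` returns a vector of two floats, the same size as the `output` vector in the example above.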

Export models

The python folder contains scripts that export models created with PyTorch, JAX, or TensorFlow to ONNX format. You can adapt them to export your own models.

TODO

memory leakage check with valgrind
add c_connector
load engine from file
gaussian blur
