This repository provides C++ and C examples that use TensorRT to run inference on models implemented with PyTorch/JAX/TensorFlow.
Prerequisites: Python 3.x, TensorRT, PyTorch, TensorFlow, ONNX, tf2onnx, JAX, CUDA Toolkit
Clone the repository:

```bash
git clone https://github.com/ggluo/TensorRT-Cpp-Example.git
cd TensorRT-Cpp-Example
```
Install onnx and tf2onnx if they are not already installed:

```bash
pip install onnx onnxscript tf2onnx
```
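To confirm the export dependencies are importable, a quick check such as the following can be used (optional; this snippet is not part of the repository):

```python
# Optional sanity check that the ONNX export dependencies are importable.
import onnx
import tf2onnx

print("onnx", onnx.__version__)
print("tf2onnx", tf2onnx.__version__)
```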
Ensure that TensorRT and the CUDA Toolkit are installed on your system, and point the makefile at your TensorRT installation accordingly:

```makefile
LDFLAGS = -L/path/to/TensorRT/lib
INCLUDEDIRS = -I/path/to/TensorRT/include
```
To run the test script, execute the following commands:

```bash
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/TensorRT/lib
bash run_all.sh
```
The script `run_all.sh` performs the following steps:
- Exports the ONNX model: `python python/export_model.py data/model.onnx`
- Compiles the TensorRT inference code: `make`
- Runs the TensorRT inference code: `./main data/model.onnx data/first_engine.trt`
The provided ONNX model is located at `data/model.onnx`, and the resulting TensorRT engine will be saved to `data/first_engine.trt`.
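If you want to check what the exported model expects before running the C++ code, the `onnx` Python package can be used to inspect it (a minimal, optional sketch; not part of the repository):

```python
# Optional: inspect the exported model's inputs and outputs with the onnx package.
import onnx

model = onnx.load("data/model.onnx")
onnx.checker.check_model(model)  # raises an exception if the model is malformed

print("inputs: ", [i.name for i in model.graph.input])
print("outputs:", [o.name for o in model.graph.output])
```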
The `main.cpp` file contains the main entry point for the TensorRT inference code. Below is an overview of its functionality:
#include "trt.h"
#include <iostream>
int main(int argc, char** argv) {
std::cout << "Hello World from TensorRT" << std::endl;
// Parse command-line arguments
infer_params params{argv[1], 1, argv[2], ""};
// Initialize TensorRT inference object
trt_infer trt(params);
trt.build();
// Copy input data from host to device
trt.CopyFromHostToDevice({0.5f, -0.5f, 1.0f}, 0, nullptr);
// Perform inference
trt.infer();
// Copy output data from device to host
std::vector<float> output(2, 0.0f);
trt.CopyFromDeviceToHost(output, 1, nullptr);
// Print output
std::cout << "Output: " << output[0] << ", " << output[1] << std::endl;
return 0;
}
This code performs the following steps:
- Initializes the TensorRT inference parameters using command-line arguments.
- Initializes the TensorRT inference object and builds the inference engine.
- Copies input data from the host to the device.
- Performs inference.
- Copies output data from the device to the host.
- Prints the output.
The `python` folder includes scripts for exporting models implemented with PyTorch/JAX/TensorFlow to ONNX format. You can customize them to export your own models if needed.
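For reference, a PyTorch-to-ONNX export might look roughly like the following. This is a minimal sketch, not the repository's actual `python/export_model.py`; it assumes a toy 3-input/2-output linear model matching the tensor sizes used in `main.cpp`:

```python
# Minimal PyTorch-to-ONNX export sketch (hypothetical; the repository's
# python/export_model.py may differ). Builds a toy 3-in/2-out linear model
# and writes it to the path given on the command line.
import sys
import torch
import torch.nn as nn

onnx_path = sys.argv[1] if len(sys.argv) > 1 else "data/model.onnx"

model = nn.Linear(3, 2)          # toy model: 3 inputs -> 2 outputs
model.eval()

dummy_input = torch.randn(1, 3)  # batch of 1, 3 features

torch.onnx.export(
    model,
    dummy_input,
    onnx_path,
    input_names=["input"],
    output_names=["output"],
)
print("Exported ONNX model to", onnx_path)
```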
TODO:
- Memory leak check with valgrind
- Add c_connector
- Load engine from file
- Gaussian blur