[Doc]: add a short document for TensorRT plugins in mmcv #797

docs/tensorrt_plugin.md (150 additions, 0 deletions)

# TensorRT Plugins for custom operators in MMCV (Experimental)

## Introduction to TensorRT

**NVIDIA TensorRT** is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that deliver low latency and high throughput for deep learning inference applications. Check its [developer's website](https://developer.nvidia.com/tensorrt) for more information.

## Why include TensorRT plugins in MMCV

- To ease the deployment of trained models with custom operators from `mmcv.ops` using TensorRT.

## List of TensorRT plugins supported in MMCV

| ONNX Operator | TensorRT Plugin | Status |
| :---------------: | :-------------------: | :----: |
| RoiAlign | MMCVRoiAlign | Y |
| ScatterND | ScatterND | Y |
| NonMaxSuppression | MMCVNonMaxSuppression | WIP |

Notes

- All plugins listed above were developed on `TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0`
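
Once the plugins have been built (see the next section), one way to confirm which plugin creators TensorRT can actually see is to query its plugin registry. The following is only a sketch: `load_tensorrt_plugin` is assumed to be the `mmcv.tensorrt` helper that loads the compiled plugin library, and the exact creator names may vary between versions.

```python
import tensorrt as trt

# `load_tensorrt_plugin` is assumed here; it should load the compiled
# MMCV plugin library so that its creators register themselves with TensorRT.
from mmcv.tensorrt import load_tensorrt_plugin

load_tensorrt_plugin()

# List every plugin creator known to this TensorRT installation; names such as
# 'MMCVRoiAlign' and 'ScatterND' are expected to appear among them.
registry = trt.get_plugin_registry()
print(sorted(creator.name for creator in registry.plugin_creator_list))
```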

## How to build TensorRT plugins in MMCV

### Prerequisites

- Clone the repository

```bash
git clone https://github.com/open-mmlab/mmcv.git
```

- Install TensorRT

Download the corresponding TensorRT build from [NVIDIA Developer Zone](https://developer.nvidia.com/nvidia-tensorrt-download).

For example, for Ubuntu 16.04 on x86-64 with CUDA 10.2, the downloaded file is `TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz`.

Then, install as below:

```bash
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz
export TENSORRT_DIR=`pwd`/TensorRT-7.2.1.6
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$TENSORRT_DIR/lib
```

Install the Python packages `tensorrt`, `graphsurgeon` and `onnx-graphsurgeon`:

```bash
pip install $TENSORRT_DIR/python/tensorrt-7.2.1.6-cp37-none-linux_x86_64.whl
pip install $TENSORRT_DIR/onnx_graphsurgeon/onnx_graphsurgeon-0.2.6-py2.py3-none-any.whl
pip install $TENSORRT_DIR/graphsurgeon/graphsurgeon-0.4.5-py2.py3-none-any.whl
```
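
As a quick sanity check that the wheel was installed into the active environment, import the package and print its version (the exact string depends on the wheel you installed):

```python
import tensorrt

# Should print the version of the installed wheel, e.g. 7.2.1.6 for the build above.
print(tensorrt.__version__)
```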

For more detailed information on installing TensorRT from the tar file, please refer to [NVIDIA's website](https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-721/install-guide/index.html#installing-tar).

### Build on Linux

```bash
cd mmcv # to MMCV root directory
MMCV_WITH_OPS=1 MMCV_WITH_TRT=1 pip install -e .
```
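
After the build finishes, a short check such as the one below (using `is_tensorrt_plugin_loaded`, which also appears in the inference example in the next section) confirms that the plugin library was compiled and can be found:

```python
from mmcv.tensorrt import is_tensorrt_plugin_loaded

# True only if the MMCV TensorRT plugin library was built and can be loaded.
print(is_tensorrt_plugin_loaded())
```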

## How to create a TensorRT engine and run inference in Python

Example

```python
import onnx
import torch

from mmcv.tensorrt import (TRTWraper, is_tensorrt_plugin_loaded, onnx2trt,
                           save_trt_engine)

assert is_tensorrt_plugin_loaded(), 'Requires to compile TensorRT plugins in mmcv'

onnx_file = 'sample.onnx'
trt_file = 'sample.trt'
onnx_model = onnx.load(onnx_file)

# Model input
inputs = torch.rand(1, 3, 224, 224).cuda()
# Model input shape info: [min_shape, opt_shape, max_shape] for each input
opt_shape_dict = {
    'input': [list(inputs.shape),
              list(inputs.shape),
              list(inputs.shape)]
}

# Create TensorRT engine
max_workspace_size = 1 << 30
trt_engine = onnx2trt(
    onnx_model,
    opt_shape_dict,
    max_workspace_size=max_workspace_size)

# Save TensorRT engine
save_trt_engine(trt_engine, trt_file)

# Run inference with TensorRT
trt_model = TRTWraper(trt_file, ['input'], ['output'])

with torch.no_grad():
    trt_outputs = trt_model({'input': inputs})
    output = trt_outputs['output']
```
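
The example above assumes that `sample.onnx` already exists. The sketch below shows one way such a file might be produced with `torch.onnx.export`; the torchvision model, file name and opset version are illustrative choices only. Models that use custom ops from `mmcv.ops` additionally need the corresponding ONNX symbolic functions registered before export.

```python
import torch
import torchvision

# Illustrative only: any model that can be exported to ONNX would work here.
model = torchvision.models.resnet18().cuda().eval()
dummy_input = torch.rand(1, 3, 224, 224).cuda()

torch.onnx.export(
    model,
    dummy_input,
    'sample.onnx',
    input_names=['input'],
    output_names=['output'],
    opset_version=11)
```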

## How to add a TensorRT plugin for a custom op in MMCV

### Main procedures

Take the RoIAlign plugin `roi_align` as an example.

1. Add the header `trt_roi_align.hpp` to the TensorRT include directory `mmcv/ops/csrc/tensorrt/`
2. Add the source file `trt_roi_align.cpp` to the TensorRT source directory `mmcv/ops/csrc/tensorrt/plugins/`
3. Add the CUDA kernel `trt_roi_align_kernel.cu` to the TensorRT source directory `mmcv/ops/csrc/tensorrt/plugins/`
4. Register the `roi_align` plugin in [trt_plugin.cpp](https://github.com/open-mmlab/mmcv/blob/master/mmcv/ops/csrc/tensorrt/plugins/trt_plugin.cpp)

```c++
#include "trt_plugin.hpp"

#include "trt_roi_align.hpp"

REGISTER_TENSORRT_PLUGIN(RoIAlignPluginDynamicCreator);

extern "C" {
bool initLibMMCVInferPlugins() { return true; }
} // extern "C"
```

5. Add a unit test to `tests/test_ops/test_tensorrt.py`
Check [here](https://github.com/open-mmlab/mmcv/blob/master/tests/test_ops/test_tensorrt.py) for examples. An illustrative sketch of such a test is shown below.
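
A unit test for a new plugin typically exports a small module that uses the op, builds an engine with `onnx2trt`, and compares the TensorRT output against the PyTorch implementation. The sketch below follows that pattern for RoIAlign; the file names, input sizes and tolerances are illustrative, and the actual tests in `tests/test_ops/test_tensorrt.py` remain the reference.

```python
import onnx
import pytest
import torch

from mmcv.ops import RoIAlign
from mmcv.tensorrt import (TRTWraper, is_tensorrt_plugin_loaded, onnx2trt,
                           save_trt_engine)


@pytest.mark.skipif(
    not is_tensorrt_plugin_loaded(), reason='TensorRT plugins are not compiled')
def test_roi_align_trt():
    # Small RoIAlign module and toy inputs; the sizes are arbitrary.
    model = RoIAlign((2, 2), spatial_scale=1.0, sampling_ratio=2).cuda().eval()
    feat = torch.rand(1, 1, 16, 16).cuda()
    rois = torch.tensor([[0., 0., 0., 4., 4.]]).cuda()  # (batch_idx, x1, y1, x2, y2)

    onnx_file = 'tmp_roi_align.onnx'
    trt_file = 'tmp_roi_align.trt'

    # Export the module to ONNX, then build and save a TensorRT engine from it.
    with torch.no_grad():
        torch.onnx.export(
            model, (feat, rois),
            onnx_file,
            input_names=['feat', 'rois'],
            output_names=['roi_feat'],
            opset_version=11)
    onnx_model = onnx.load(onnx_file)
    opt_shape_dict = {
        'feat': [list(feat.shape)] * 3,
        'rois': [list(rois.shape)] * 3,
    }
    trt_engine = onnx2trt(onnx_model, opt_shape_dict, max_workspace_size=1 << 30)
    save_trt_engine(trt_engine, trt_file)

    # Run the engine and compare against the PyTorch implementation.
    trt_model = TRTWraper(trt_file, ['feat', 'rois'], ['roi_feat'])
    with torch.no_grad():
        trt_out = trt_model({'feat': feat, 'rois': rois})['roi_feat']
        pytorch_out = model(feat, rois)

    assert torch.allclose(trt_out, pytorch_out, rtol=1e-3, atol=1e-5)
```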

### Reminders

- Some of the [custom ops](https://mmcv.readthedocs.io/en/latest/ops.html) in `mmcv` already have CUDA implementations, which can be referred to when implementing the TensorRT kernels.

## Known Issues

- None

## References

- [NVIDIA TensorRT developer guide](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html)
- [TensorRT Open Source Software](https://github.com/NVIDIA/TensorRT)
- [onnx-tensorrt](https://github.com/onnx/onnx-tensorrt)
- [TensorRT Python API](https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/index.html)
- [TensorRT C++ plugin API](https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_plugin.html)