Apart from deploy.py
, there are other useful tools under the tools/
directory.
This tool can be used to convert PyTorch model from OpenMMLab to ONNX.
python tools/torch2onnx.py \
${DEPLOY_CFG} \
${MODEL_CFG} \
${CHECKPOINT} \
${INPUT_IMG} \
--work-dir ${WORK_DIR} \
--device cpu \
--log-level INFO
deploy_cfg
: The path of the deploy config file in MMDeploy codebase.model_cfg
: The path of model config file in OpenMMLab codebase.checkpoint
: The path of the model checkpoint file.img
: The path of the image file used to convert the model.--work-dir
: Directory to save output ONNX models Default is./work-dir
.--device
: The device used for conversion. If not specified, it will be set tocpu
.--log-level
: To set log level which in'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'
. If not specified, it will be set toINFO
.
ONNX model with Mark
nodes in it can be partitioned into multiple subgraphs. This tool can be used to extract the subgraph from the ONNX model.
python tools/extract.py \
${INPUT_MODEL} \
${OUTPUT_MODEL} \
--start ${PARITION_START} \
--end ${PARITION_END} \
--log-level INFO
input_model
: The path of input ONNX model. The output ONNX model will be extracted from this model.output_model
: The path of output ONNX model.--start
: The start point of extracted model with format<function_name>:<input/output>
. Thefunction_name
comes from the decorator@mark
.--end
: The end point of extracted model with format<function_name>:<input/output>
. Thefunction_name
comes from the decorator@mark
.--log-level
: To set log level which in'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'
. If not specified, it will be set toINFO
.
To support the model partition, you need to add Mark nodes in the ONNX model. The Mark node comes from the @mark
decorator.
For example, if we have marked the multiclass_nms
as below, we can set end=multiclass_nms:input
to extract the subgraph before NMS.
@mark('multiclass_nms', inputs=['boxes', 'scores'], outputs=['dets', 'labels'])
def multiclass_nms(*args, **kwargs):
"""Wrapper function for `_multiclass_nms`."""
This tool helps to convert an ONNX
model to an PPLNN
model.
python tools/onnx2pplnn.py \
${ONNX_PATH} \
${OUTPUT_PATH} \
--device cuda:0 \
--opt-shapes [224,224] \
--log-level INFO
onnx_path
: The path of theONNX
model to convert.output_path
: The convertedPPLNN
algorithm path in json format.device
: The device of the model during conversion.opt-shapes
: Optimal shapes for PPLNN optimization. The shape of each tensor should be wrap with "[]" or "()" and the shapes of tensors should be separated by ",".--log-level
: To set log level which in'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'
. If not specified, it will be set toINFO
.
This tool can be used to convert ONNX to TensorRT engine.
python tools/onnx2tensorrt.py \
${DEPLOY_CFG} \
${ONNX_PATH} \
${OUTPUT} \
--device-id 0 \
--log-level INFO \
--calib-file /path/to/file
deploy_cfg
: The path of the deploy config file in MMDeploy codebase.onnx_path
: The ONNX model path to convert.output
: The path of output TensorRT engine.--device-id
: The device index, default to0
.--calib-file
: The calibration data used to calibrate engine to int8.--log-level
: To set log level which in'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'
. If not specified, it will be set toINFO
.
This tool helps to convert an ONNX
model to an ncnn
model.
python tools/onnx2ncnn.py \
${ONNX_PATH} \
${NCNN_PARAM} \
${NCNN_BIN} \
--log-level INFO
onnx_path
: The path of theONNX
model to convert from.output_param
: The convertedncnn
param path.output_bin
: The convertedncnn
bin path.--log-level
: To set log level which in'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'
. If not specified, it will be set toINFO
.
This tool helps to test latency of models with PyTorch, TensorRT and other backends. Note that the pre- and post-processing is excluded when computing inference latency.
python tools/profiler.py \
${DEPLOY_CFG} \
${MODEL_CFG} \
${IMAGE_DIR} \
--model ${MODEL} \
--device ${DEVICE} \
--shape ${SHAPE} \
--num-iter ${NUM_ITER} \
--warmup ${WARMUP} \
--cfg-options ${CFG_OPTIONS} \
--batch-size ${BATCH_SIZE} \
--img-ext ${IMG_EXT}
deploy_cfg
: The path of the deploy config file in MMDeploy codebase.model_cfg
: The path of model config file in OpenMMLab codebase.image_dir
: The directory to image files that used to test the model.--model
: The path of the model to be tested.--shape
: Input shape of the model byHxW
, e.g.,800x1344
. If not specified, it would useinput_shape
from deploy config.--num-iter
: Number of iteration to run inference. Default is100
.--warmup
: Number of iteration to warm-up the machine. Default is10
.--device
: The device type. If not specified, it will be set tocuda:0
.--cfg-options
: Optional key-value pairs to be overrode for model config.--batch-size
: the batch size for test inference. Default is1
. Note that not all models supportbatch_size>1
.--img-ext
: the file extensions for input images fromimage_dir
. Defaults to['.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif']
.
python tools/profiler.py \
configs/mmcls/classification_tensorrt_dynamic-224x224-224x224.py \
../mmclassification/configs/resnet/resnet18_8xb32_in1k.py \
../mmclassification/demo/ \
--model work-dirs/mmcls/resnet/trt/end2end.engine \
--device cuda \
--shape 224x224 \
--num-iter 100 \
--warmup 10 \
--batch-size 1
And the output look like this:
----- Settings:
+------------+---------+
| batch size | 1 |
| shape | 224x224 |
| iterations | 100 |
| warmup | 10 |
+------------+---------+
----- Results:
+--------+------------+---------+
| Stats | Latency/ms | FPS |
+--------+------------+---------+
| Mean | 1.535 | 651.656 |
| Median | 1.665 | 600.569 |
| Min | 1.308 | 764.341 |
| Max | 1.689 | 591.983 |
+--------+------------+---------+