
Exception in batch inference with SDK #839

Closed
Mo-Kanya opened this issue Jul 29, 2022 · 6 comments
@Mo-Kanya

Thank you for the great work,

I exported a Faster R-CNN TRT model with your tools, and it works fine with the inference API from Model Converter. However, when I use the MMDeploy SDK (Python API) for inference, I get tons of [warning] [bulk.h:39] fallback Bulk implementation messages, and the model fails to do batch inference: it infers images one by one, regardless of the size of the input image list.

Please tell me if you need any information.

@lvhan028
Collaborator

lvhan028 commented Jul 29, 2022

The warning [warning] [bulk.h:39] fallback Bulk implementation has nothing to do with this issue. You can run SPDLOG_LEVEL=error python demo/python/object_detection.py cuda model_path image_path to suppress it.

Please post your environment info by running python tools/check_env.py.

Also, can you share your code about using SDK Python API?

@Mo-Kanya
Author

Thanks for your timely reply.

2022-07-29 12:19:05,439 - mmdeploy - INFO -

2022-07-29 12:19:05,439 - mmdeploy - INFO - Environmental information
2022-07-29 12:19:05,709 - mmdeploy - INFO - sys.platform: linux
2022-07-29 12:19:05,710 - mmdeploy - INFO - Python: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0]
2022-07-29 12:19:05,710 - mmdeploy - INFO - CUDA available: True
2022-07-29 12:19:05,710 - mmdeploy - INFO - GPU 0: Quadro GV100
2022-07-29 12:19:05,710 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda
2022-07-29 12:19:05,710 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.3, V11.3.58
2022-07-29 12:19:05,710 - mmdeploy - INFO - GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
2022-07-29 12:19:05,710 - mmdeploy - INFO - PyTorch: 1.12.0
2022-07-29 12:19:05,710 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:

  • GCC 9.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.3
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  • CuDNN 8.3.2 (built against CUDA 11.5)
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

2022-07-29 12:19:05,710 - mmdeploy - INFO - TorchVision: 0.13.0
2022-07-29 12:19:05,710 - mmdeploy - INFO - OpenCV: 4.6.0
2022-07-29 12:19:05,710 - mmdeploy - INFO - MMCV: 1.6.0
2022-07-29 12:19:05,710 - mmdeploy - INFO - MMCV Compiler: GCC 9.3
2022-07-29 12:19:05,710 - mmdeploy - INFO - MMCV CUDA Compiler: 11.3
2022-07-29 12:19:05,710 - mmdeploy - INFO - MMDeploy: 0.6.0+8bdf1cf
2022-07-29 12:19:05,710 - mmdeploy - INFO -

2022-07-29 12:19:05,710 - mmdeploy - INFO - Backend information
2022-07-29 12:19:06,201 - mmdeploy - INFO - onnxruntime: 1.8.1 ops_is_avaliable : True
2022-07-29 12:19:06,240 - mmdeploy - INFO - tensorrt: 7.2.3.4 ops_is_avaliable : True
2022-07-29 12:19:06,259 - mmdeploy - INFO - ncnn: None ops_is_avaliable : False
2022-07-29 12:19:06,260 - mmdeploy - INFO - pplnn_is_avaliable: False
2022-07-29 12:19:06,261 - mmdeploy - INFO - openvino_is_avaliable: False
2022-07-29 12:19:06,261 - mmdeploy - INFO -

2022-07-29 12:19:06,261 - mmdeploy - INFO - Codebase information
2022-07-29 12:19:06,262 - mmdeploy - INFO - mmdet: 2.24.0
2022-07-29 12:19:06,262 - mmdeploy - INFO - mmseg: None
2022-07-29 12:19:06,262 - mmdeploy - INFO - mmcls: None
2022-07-29 12:19:06,262 - mmdeploy - INFO - mmocr: None
2022-07-29 12:19:06,262 - mmdeploy - INFO - mmedit: None
2022-07-29 12:19:06,262 - mmdeploy - INFO - mmdet3d: None
2022-07-29 12:19:06,262 - mmdeploy - INFO - mmpose: None
2022-07-29 12:19:06,262 - mmdeploy - INFO - mmrotate: None

@Mo-Kanya
Author

Mo-Kanya commented Jul 29, 2022

And here is the code:

from build.lib.mmdeploy_python import Detector
from os import listdir
from os.path import isfile, join
import cv2
import time

model_path = '/root/workspace/mmdeploy_models/trt/faster-rcnn-static/'
image_path = '/'.join(('/root/workspace/mmdetection', 'demo/demo.jpg'))

batch_size = 1
total = 512
mypath = '/root/workspace/mmdetection/tests/data/VOCdevkit/VOC2012/JPEGImages'
onlyfiles = [join(mypath, f) for f in listdir(mypath) if isfile(join(mypath, f))]

detector = Detector(model_path, 'cuda', 0)
total_time = 0
warmup = int(512 / batch_size)
batch_num = int(total / batch_size)

for i in range(warmup + batch_num):
    # read the next batch of images from disk
    imgs = [cv2.imread(f) for f in onlyfiles[i * batch_size:(i + 1) * batch_size]]
    if i < warmup:
        bboxes, labels, _ = detector(imgs)[0]
    else:
        start = time.time()
        bboxes, labels, _ = detector(imgs)[0]
        end = time.time()
        total_time += (end - start)

print("\nTest over {} images with a batch size {}: mean time {} per image\n".format(
    total, batch_size, total_time * 1.0 / total))

The model_path stores the converter's output files. I tested different settings with the TRT backend, such as a dynamic batch size and a static batch size of 4. Both show the same issue with the SDK.

@Mo-Kanya changed the title from "Warning: fallback Bulk implementation" to "Exception in batch inference with SDK" on Aug 4, 2022
@Mo-Kanya
Copy link
Author

Mo-Kanya commented Aug 4, 2022

Hi,
Could you please tell me whether this exception can be reproduced in any environment? I wonder if it is a bug or just my misuse.

@lzhangzz
Collaborator

lzhangzz commented Aug 5, 2022

Just ignore the warning about bulk implementations; it has nothing to do with batch inference.

Batch inference in the SDK is experimental and must be turned on explicitly in the configuration file. In the model's pipeline.json, insert the field "is_batched": true into the config of the task whose module is Net:

{
    "name": "yolox",
    "type": "Task",
    "module": "Net",
    "is_batched": true,   // <--
    "input": ["prep_output"],
    "output": ["infer_output"],
    "input_map": {"img": "input"}
}

Also be aware that, after preprocessing, the images must be of the same size to form a batch.
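If you have many converted models to patch, the edit above can also be scripted. The sketch below assumes pipeline.json nests its task list under "pipeline" -> "tasks" (this layout may differ between MMDeploy versions, so check your own file), and enable_batch_inference is a hypothetical helper name, not part of the SDK:

```python
import json

def enable_batch_inference(pipeline_path):
    """Set "is_batched": true on every task whose module is "Net".

    Hypothetical helper: assumes the task list sits at
    pipeline -> tasks in pipeline.json; adjust the path
    if your model's config is structured differently.
    """
    with open(pipeline_path) as f:
        config = json.load(f)
    for task in config.get("pipeline", {}).get("tasks", []):
        if task.get("module") == "Net":
            task["is_batched"] = True
    with open(pipeline_path, "w") as f:
        json.dump(config, f, indent=2)
```

Note that this writes plain JSON, so unlike the annotated snippet above it does not (and cannot) preserve // comments.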

@Mo-Kanya
Author

Mo-Kanya commented Aug 5, 2022

Thanks, I can verify that this works.
