As the number of CPU cores decreases, the BLS mode processing time increases #7373

Open
callmezhangchenchenokay opened this issue Jun 25, 2024 · 0 comments

callmezhangchenchenokay commented Jun 25, 2024

Description
A BLS model calls a TensorRT backend model hundreds of times, and the total processing time increases as the number of available CPU cores decreases.
Triton Information
nvcr.io/nvidia/tritonserver:24.05-py3

To Reproduce
My BLS code looks like this: model.py (the BLS model) calls t2s_sdec, whose platform is "tensorrt_plan".
[Screenshot of model.py; image not preserved]

Model conversion:

/usr/src/tensorrt/bin/trtexec --onnx=nahida_t2s_encoder_sim.onnx \
--shapes=ref_seq:1x40,text_seq:1x100,ref_bert:40x1024,text_bert:100x1024,ssl_content:1x768x350  \
--minShapes=ref_seq:1x1,text_seq:1x1,ref_bert:1x1024,text_bert:1x1024,ssl_content:1x768x240 \
--optShapes=ref_seq:1x40,text_seq:1x100,ref_bert:40x1024,text_bert:100x1024,ssl_content:1x768x350 \
--maxShapes=ref_seq:1x500,text_seq:1x500,ref_bert:500x1024,text_bert:500x1024,ssl_content:1x768x500 \
--saveEngine=nahida_t2s_encoder_sim.engine

With each iteration of the for loop, the input gradually becomes larger.
The following is set in t2s_sdec/config.json: parameters: {key: "FORCE_CPU_ONLY_INPUT_TENSORS" value: {string_value:"no"}}
With 100 CPU cores, 387 calls: the total time is 2 s and the other time is 300 ms.
With 24 CPU cores, 387 calls: the total time is 5 s and the other time is 600 ms.
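
For anyone trying to reproduce these numbers, a minimal timing sketch like the one below could separate the time spent inside the repeated calls from everything else in the loop. It assumes that "other time" means time outside the model calls, which the issue does not state. `call_model` and `fake_call` are hypothetical stand-ins for the BLS request to t2s_sdec, not code from the issue.

```python
import time


def time_loop(call_model, n_calls):
    """Run call_model n_calls times; return (total_s, in_call_s, other_s)."""
    in_call = 0.0
    start = time.perf_counter()
    for step in range(n_calls):
        t0 = time.perf_counter()
        call_model(step)  # hypothetical: one inference request per iteration
        in_call += time.perf_counter() - t0
    total = time.perf_counter() - start
    # "other" is loop time not spent inside call_model (assumed meaning)
    return total, in_call, total - in_call


def fake_call(step):
    # Hypothetical placeholder for the real request; just sleeps briefly.
    time.sleep(0.001)


if __name__ == "__main__":
    total, in_call, other = time_loop(fake_call, 387)
    print(f"total={total:.3f}s in_call={in_call:.3f}s other={other:.3f}s")
```

Running the same harness with 100 cores and with 24 cores would show whether the slowdown sits inside the calls or in the surrounding Python code.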

The number of CPU cores is limited when the Docker container is started, using --cpuset-cpus=0-23.
There is no interference from other processes.
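
To confirm that the --cpuset-cpus limit is actually what the process sees, a small check such as the following can be run inside the container. This is only a sketch: `visible_cpus` is a hypothetical helper name, and `os.sched_getaffinity` is available on Linux only, so the code falls back to `os.cpu_count()` elsewhere.

```python
import os


def visible_cpus():
    """Return the number of CPUs this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: respects cpuset restrictions such as --cpuset-cpus
        return len(os.sched_getaffinity(0))
    # Fallback: total CPUs on the host, which may ignore container limits
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(f"CPUs visible to this process: {visible_cpus()}")
```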
Expected behavior
I expect that reducing the number of CPU cores should not affect the overall processing time.
