
Tensor sizes differ between ONNX model and DLA loadable engine outputs #17

Closed
AnhPC03 opened this issue Jan 23, 2024 · 2 comments


AnhPC03 commented Jan 23, 2024

Hello,
I was able to run your repo, and when I printed the input and output tensor sizes, the values were:

[hybrid mode] create cuDLA device SUCCESS
[hybrid mode] load cuDLA module from memory SUCCESS
[hybrid mode] cuDLA module get number of input tensors SUCCESS
[hybrid mode] cuDLA module get number of output tensors SUCCESS
[hybrid mode] cuDLA module get input tensors descriptors SUCCESS
[hybrid mode] cuDLA module get output tensors descriptors SUCCESS
Input tensor size: 1806336 (1x4x672x672)
Output tensor size 0: 3612672 (1x?x?x?)
Output tensor size 1: 903168 (1x?x?x?)
Output tensor size 2: 225792 (1x?x?x?)
[hybrid mode] register cuda input tensor memory to cuDLA SUCCESS
[hybrid mode] register cuda output tensor memory to cuDLA SUCCESS
[hybrid mode] register cuda output tensor memory to cuDLA SUCCESS
[hybrid mode] register cuda output tensor memory to cuDLA SUCCESS

But the converted ONNX model has these values; I saw the same values in Netron:

Input tensor size: 1354752 (1x3x672x672)
Output tensor size 0: 1799280 (1x255x84x84)
Output tensor size 1: 449820 (1x255x42x42)
Output tensor size 2: 112455 (1x255x21x21)
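
(For reference, the DLA loadable sizes above appear to be exactly these ONNX element counts padded to the vectorized I/O formats requested at build time. A minimal sketch of that arithmetic in shell — the pad-to-16 and 2-bytes-per-element assumptions follow from the fp16:chw16 output format in the build command below; the pad-to-4 input assumption is my guess at the DLA int8 layout:

# fp16:chw16 outputs: C padded up to a multiple of 16, 2 bytes per element
echo $(( (255 + 15) / 16 * 16 * 84 * 84 * 2 ))  # 3612672, matches output 0
echo $(( (255 + 15) / 16 * 16 * 42 * 42 * 2 ))  # 903168,  matches output 1
echo $(( (255 + 15) / 16 * 16 * 21 * 21 * 2 ))  # 225792,  matches output 2
# int8 input: assuming C padded from 3 up to 4, 1 byte per element
echo $(( 4 * 672 * 672 * 1 ))                   # 1806336, matches the 1x4x672x672 input
)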

If I convert the ONNX model to a .engine for inference using only the GPU:

${TRTEXEC} --shapes=images:1x3x672x672 --onnx=data/model/yolov5_trimmed_qat_noqdq.onnx --saveEngine=data/gpu/yolov5.int8.int8chw32in.fp16chw16out.engine --inputIOFormats=int8:chw32 --outputIOFormats=fp16:chw16 --int8 --fp16 --calib=data/model/qat2ptq.cache --precisionConstraints=prefer --layerPrecisions="/model.24/m.0/Conv":fp16,"/model.24/m.1/Conv":fp16,"/model.24/m.2/Conv":fp16

This engine has the same tensor sizes as Netron, but I couldn't run inference with this .engine on the GPU.

How can I build an engine for GPU-only inference that has the same tensor sizes as your DLA loadable?
Thank you very much.

Collaborator

lynettez commented Feb 1, 2024

Hey @AnhPC03, sorry for the late reply. Did you try "--minShapes=images:1x3x672x672 --maxShapes=images:1x3x672x672 --optShapes=images:1x3x672x672 --shapes=images:1x3x672x672" to specify the input shape range?
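
Combined with the original command, that would look something like this (an untested sketch that just reuses the same paths and flags from the command above):

${TRTEXEC} --minShapes=images:1x3x672x672 --optShapes=images:1x3x672x672 --maxShapes=images:1x3x672x672 --shapes=images:1x3x672x672 --onnx=data/model/yolov5_trimmed_qat_noqdq.onnx --saveEngine=data/gpu/yolov5.int8.int8chw32in.fp16chw16out.engine --inputIOFormats=int8:chw32 --outputIOFormats=fp16:chw16 --int8 --fp16 --calib=data/model/qat2ptq.cache --precisionConstraints=prefer --layerPrecisions="/model.24/m.0/Conv":fp16,"/model.24/m.1/Conv":fp16,"/model.24/m.2/Conv":fp16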

lynettez self-assigned this Feb 1, 2024
Collaborator

lynettez commented Sep 2, 2024

Closing since there has been no activity for several months, thanks!

lynettez closed this as completed Sep 2, 2024