
trtexec fails to build /mobile_sam_mask_decoder.onnx #16

Open
fdarvas opened this issue Dec 18, 2023 · 7 comments


fdarvas commented Dec 18, 2023

Trying to run:

trtexec --onnx=data/mobile_sam_mask_decoder.onnx --saveEngine=data/mobile_sam_mask_decoder.engine --minShapes=point_coords:1x1x2,point_labels:1x1 --optShapes=point_coords:1x1x2,point_labels:1x1 --maxShapes=point_coords:1x10x2,point_labels:1x10

after successfully exporting mobile_sam_mask_decoder.onnx with:
python3 -m nanosam.tools.export_sam_mask_decoder_onnx --model-type=vit_t --checkpoint=assets/mobile_sam.pt --output=/mnt/e/data/mobile_sam_mask_decoder.onnx

resulting in this error:

onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[12/18/2023-11:39:43] [E] Error[4]: [graph.cpp::symbolicExecute::539] Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[12/18/2023-11:39:43] [E] [TRT] ModelImporter.cpp:771: While parsing node number 146 [Tile -> "/Tile_output_0"]:
[12/18/2023-11:39:43] [E] [TRT] ModelImporter.cpp:772: --- Begin node ---
[12/18/2023-11:39:43] [E] [TRT] ModelImporter.cpp:773: input: "/Unsqueeze_3_output_0"
input: "/Reshape_2_output_0"
output: "/Tile_output_0"
name: "/Tile"
op_type: "Tile"

[12/18/2023-11:39:43] [E] [TRT] ModelImporter.cpp:774: --- End node ---
[12/18/2023-11:39:43] [E] [TRT] ModelImporter.cpp:777: ERROR: ModelImporter.cpp:195 In function parseGraph:
[6] Invalid Node - /Tile
[graph.cpp::symbolicExecute::539] Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[12/18/2023-11:39:43] [E] Failed to parse onnx file
[12/18/2023-11:39:43] [I] Finished parsing network model. Parse time: 0.32614
[12/18/2023-11:39:43] [E] Parsing model failed
[12/18/2023-11:39:43] [E] Failed to create engine from model or file.
[12/18/2023-11:39:43] [E] Engine set up failed

@Awesome0324

I had the same problem.
Have you solved it yet?


fdarvas commented Mar 5, 2024

Unfortunately, I don't have a solution for it yet.


Rich2020 commented Mar 7, 2024

Bump...

Same issue - any help would be much appreciated; thanks!


fwcore commented Mar 11, 2024

A possible workaround:

polygraphy surgeon sanitize data/mobile_sam_mask_decoder.onnx --fold-constants -o data/mobile_sam_mask_decoder_folded.onnx --fold-size-threshold 64

You might also need to install onnx-graphsurgeon
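
If either tool is missing, both should be installable with pip (older releases may require NVIDIA's package index at https://pypi.ngc.nvidia.com):

python3 -m pip install polygraphy onnx-graphsurgeon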

The resulting ONNX file has no OneHot op and can be converted to TensorRT without problems.


More details

Inspecting with Netron: the ONNX file exported by the following command contains a OneHot op.

python3 -m nanosam.tools.export_sam_mask_decoder_onnx --model-type=vit_t --checkpoint=assets/mobile_sam.pt --output=/mnt/e/data/mobile_sam_mask_decoder.onnx

However, the ONNX file provided via the Google Drive link in README.md does not contain a OneHot op; it appears to have been replaced by constant tensors and a Where op. I don't know how that file was exported.
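
If you'd rather not open Netron, here is a minimal check using the onnx Python package (output path assumed from the export command above; adjust to your setup):

import onnx

model = onnx.load("data/mobile_sam_mask_decoder.onnx")  # path is an example
op_types = {node.op_type for node in model.graph.node}   # all op types in the graph
print("OneHot" in op_types)  # True for the problematic export, False for the working file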


binh234 commented Mar 20, 2024

Two possible workarounds (use either one):

  • Use torch==2.0.1; newer torch versions export a OneHot op in the output ONNX file, which TensorRT cannot parse (see the commands after this list)
  • Replace nanosam/mobile_sam/modeling/mask_decoder.py with this gist code, which works for most torch versions
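
For the first option, something like this should work (export command copied from above; adjust paths as needed):

python3 -m pip install torch==2.0.1
python3 -m nanosam.tools.export_sam_mask_decoder_onnx --model-type=vit_t --checkpoint=assets/mobile_sam.pt --output=data/mobile_sam_mask_decoder.onnx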


songhat commented Sep 26, 2024


Awesome! Could you tell why it works?


fwcore commented Sep 27, 2024


This is the changed part of the above gist code (https://gist.github.com/binh234/2bb4fb5be3066460825786ba7d46c55c#file-mask_decoder-py-L126):

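        # When exporting to ONNX, tile with Tensor.repeat so the exporter emits
        # a Tile op instead of the OneHot op that repeat_interleave produces.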
        if torch.onnx.is_in_onnx_export():
            pos_src = image_pe.repeat([tokens.shape[0]] + [1] * (len(image_pe.shape) - 1))
            src = image_embeddings.repeat(
                [tokens.shape[0]] + [1] * (len(image_embeddings.shape) - 1)
            )
        else:
            pos_src = torch.repeat_interleave(image_pe, tokens.shape[0], dim=0)
            src = torch.repeat_interleave(image_embeddings, tokens.shape[0], dim=0)

@binh234's fix replaces torch.repeat_interleave with torch.Tensor.repeat for the ONNX-export path.
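
A minimal sketch of why the swap is safe here (shapes chosen to mimic the SAM decoder, where image_pe and image_embeddings have a leading batch dimension of 1):

import torch

x = torch.randn(1, 256, 64, 64)  # stand-in for image_embeddings
n = 4                            # stand-in for tokens.shape[0]
a = torch.repeat_interleave(x, n, dim=0)
b = x.repeat([n] + [1] * (len(x.shape) - 1))
assert torch.equal(a, b)  # identical because x.shape[0] == 1

When the repeated dimension has size 1, tiling (repeat) and interleaving (repeat_interleave) produce the same tensor, but repeat exports as a plain Tile op that TensorRT parses without trouble.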

This PR in PyTorch ([ONNX] Simplify repeat_intereleave export for scalar-valued 'repeat', pytorch/pytorch#100575) changed the ONNX export of torch.repeat_interleave to use a OneHot op.

It was merged on May 6, 2023, and the ONNX export behavior changed starting with PyTorch 2.1.0. Please check here: https://github.com/pytorch/pytorch/blame/v2.1.0-rc1/torch/onnx/symbolic_helper.py
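
So if your exported ONNX contains a OneHot op, a quick way to confirm you are on an affected version:

python3 -c "import torch; print(torch.__version__)"  # 2.1.0 or later exports OneHot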
