fix: Repair output binding indexing scheme in TRT #2054

Merged
merged 1 commit into from
Jun 23, 2023

Conversation

gs-olive
Collaborator

Description

  • Output binding indices are not guaranteed to be strictly greater than input binding indices
  • Instead, we can use the output binding map as the ground truth for the mapping from TRT indices to PyTorch Tensor indices
  • This bug is encountered when using TRTModuleNext on certain models (see issue below)

Fixes #2053
Addresses Second Bug in #1565

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • [ x ] My code follows the style guidelines of this project (You can use the linters)
  • [ x ] I have performed a self-review of my own code
  • [ x ] I have commented my code, particularly in hard-to-understand areas and hacks
  • [ x ] I have made corresponding changes to the documentation
  • [ x ] I have added tests to verify my fix or my feature
  • [ x ] New and existing unit tests pass locally with my changes
  • [ x ] I have added the relevant labels to my PR so that relevant reviewers are notified

@gs-olive gs-olive requested a review from narendasan June 22, 2023 22:17
@gs-olive gs-olive self-assigned this Jun 22, 2023
@github-actions github-actions bot added the component: core Issues re: The core compiler label Jun 22, 2023
@github-actions github-actions bot requested a review from bowang007 June 22, 2023 22:17
Comment on lines -158 to -159
for (size_t o = inputs.size(); o < (compiled_engine->num_io.first + compiled_engine->num_io.second); o++) {
  uint64_t pyt_idx = compiled_engine->out_binding_map[o];
Collaborator Author

In the case of the detectron model in #2053, the input and output binding maps were of the form:

DEBUG: [Torch-TensorRT - Debug Build] - Input binding name: arg264_1 has TensorRT binding index: 0, Torch binding index: 0
DEBUG: [Torch-TensorRT - Debug Build] - Input binding name: arg268_1 has TensorRT binding index: 1, Torch binding index: 1
DEBUG: [Torch-TensorRT - Debug Build] - Input binding name: add_121 has TensorRT binding index: 2, Torch binding index: 2
DEBUG: [Torch-TensorRT - Debug Build] - Input binding name: clamp_51 has TensorRT binding index: 3, Torch binding index: 3
DEBUG: [Torch-TensorRT - Debug Build] - Input binding name: arg72_1 has TensorRT binding index: 7, Torch binding index: 4
DEBUG: [Torch-TensorRT - Debug Build] - Input binding name: arg73_1 has TensorRT binding index: 8, Torch binding index: 5
DEBUG: [Torch-TensorRT - Debug Build] - Input binding name: arg74_1 has TensorRT binding index: 9, Torch binding index: 6
DEBUG: [Torch-TensorRT - Debug Build] - Input binding name: arg75_1 has TensorRT binding index: 10, Torch binding index: 7
DEBUG: [Torch-TensorRT - Debug Build] - Output binding name: output0 has TensorRT binding index: 4, Torch binding index: 8
DEBUG: [Torch-TensorRT - Debug Build] - Output binding name: output1 has TensorRT binding index: 5, Torch binding index: 9
DEBUG: [Torch-TensorRT - Debug Build] - Output binding name: output2 has TensorRT binding index: 6, Torch binding index: 10
DEBUG: [Torch-TensorRT - Debug Build] - Output binding name: output3 has TensorRT binding index: 11, Torch binding index: 11

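Some of the output bindings have TRT indices which are smaller than the number of input bindings (here, outputs at TRT indices 4, 5, and 6 against 8 inputs). Indexing out_binding_map with a contiguous counter such as o, as in the removed loop above, therefore looks up keys that are not output bindings, which causes errors.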
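
To make the failure mode concrete, here is a minimal standalone sketch that uses only the output indices from the log above. It is not the code from this PR. `BindingMap`, `missing_output_keys`, and `output_bindings` are hypothetical names, and the map type (TRT index to Torch index, as an `std::unordered_map`) is an assumption for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the engine's output binding map:
// key = TensorRT binding index, value = Torch binding index.
using BindingMap = std::unordered_map<uint64_t, uint64_t>;

// Old scheme: assume outputs occupy the TRT indices
// [num_inputs, num_inputs + num_outputs). Returns every index in that
// range that is NOT a key of the output binding map, i.e. every lookup
// the old loop would get wrong.
std::vector<uint64_t> missing_output_keys(const BindingMap& out_binding_map,
                                          std::size_t num_inputs,
                                          std::size_t num_outputs) {
  std::vector<uint64_t> missing;
  for (std::size_t o = num_inputs; o < num_inputs + num_outputs; o++) {
    if (out_binding_map.find(o) == out_binding_map.end()) {
      missing.push_back(o);
    }
  }
  return missing;
}

// Map-driven scheme: treat the map itself as the ground truth and visit
// each (TRT index, Torch index) pair it contains. The pairs are sorted
// only so that the result is deterministic.
std::vector<std::pair<uint64_t, uint64_t>> output_bindings(
    const BindingMap& out_binding_map) {
  std::vector<std::pair<uint64_t, uint64_t>> pairs(out_binding_map.begin(),
                                                   out_binding_map.end());
  std::sort(pairs.begin(), pairs.end());
  return pairs;
}
```

With the map from the log, `{{4, 8}, {5, 9}, {6, 10}, {11, 11}}`, and 8 inputs, the range-based scheme would query keys 8, 9, and 10, none of which is an output binding. Iterating over the map visits exactly the four real output bindings.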

Collaborator

@narendasan narendasan left a comment

LGTM

@gs-olive gs-olive merged commit 6e4aa0b into pytorch:main Jun 23, 2023
8 checks passed
@gs-olive gs-olive deleted the trt_runtime_output_fix branch June 23, 2023 02:28

Successfully merging this pull request may close these issues.

🐛 [Bug] Encountered bug when using TRTModuleNext in Dynamo
3 participants