[TensorRT EP] mem leak fix #22863

Closed
wants to merge 4 commits into from
@@ -90,4 +90,5 @@ struct OrtTensorRTProviderOptionsV2 {
const char* trt_op_types_to_exclude{"NonMaxSuppression,NonZero,RoiAlign"}; // Exclude specific ops from running on TRT.
// There is a known performance issue with the DDS ops (NonMaxSuppression, NonZero and RoiAlign) from TRT versions 10.0 to 10.7.
// TRT EP excludes DDS ops from running on TRT by default, user can override default value with empty string to include all ops.
+ int trt_op_types_to_exclude_str_is_dynamic_allocation = 0; // Indicate trt_op_types_to_exclude points to a static allocation or dynamic allocation. It's for internal use to free the dynamic allocation buffer.
};
@@ -360,6 +360,9 @@ void TensorrtExecutionProviderInfo::UpdateProviderOptions(void* provider_options
trt_provider_options_v2.trt_engine_hw_compatible = internal_options.engine_hw_compatible;
trt_provider_options_v2.trt_onnx_bytestream = internal_options.onnx_bytestream;
trt_provider_options_v2.trt_onnx_bytestream_size = internal_options.onnx_bytestream_size;
- trt_provider_options_v2.trt_op_types_to_exclude = copy_string_if_needed(internal_options.op_types_to_exclude);
+ if (options.find("trt_op_types_to_exclude") != options.end()) {
+ trt_provider_options_v2.trt_op_types_to_exclude = copy_string_if_needed(internal_options.op_types_to_exclude);
Member

copy_string_if_needed() doesn't always dynamically allocate memory, right? Only if string_copy is true?
This is_dynamic_allocation check looks ugly.

Contributor Author

@chilo-ms, Nov 18, 2024

I can add a check so that is_dynamic_allocation is set to true only if string_copy is true.

Yeah, I agree that is_dynamic_allocation is ugly, but I can't think of a better solution that doesn't need a flag.
The reason is that options_v2.trt_op_types_to_exclude, which is a const char*, can point to either a static or a dynamic allocation:

  • static allocation - when an OrtTensorRTProviderOptionsV2 instance is instantiated, trt_op_types_to_exclude points to a static allocation (the default string literal).
  • dynamic allocation - when the user calls OrtApis::UpdateTensorRTProviderOptions and specifies another value for trt_op_types_to_exclude, it points to a dynamic allocation.

So once ReleaseTensorRTProviderOptions is called, it's hard to tell whether trt_op_types_to_exclude points to a static or a dynamic allocation. (Note: we can't delete a static allocation buffer.) That's why we might need a flag here.
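
To make the ownership problem concrete, here is a minimal C++ sketch of the flag-based approach described above. It is not the ONNX Runtime code: `DemoOptions`, `DuplicateString`, `SetOpTypesToExclude` and `ReleaseOpTypesToExclude` are hypothetical stand-ins that only model the two fields under discussion.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the options struct; only the two relevant fields.
struct DemoOptions {
  // Default value is a string literal (static storage), so it must never be delete[]'d.
  const char* op_types_to_exclude{"NonMaxSuppression,NonZero,RoiAlign"};
  // Set to 1 once op_types_to_exclude points to a buffer allocated with new[].
  int op_types_to_exclude_is_dynamic{0};
};

// Hypothetical helper: returns a heap copy of s allocated with new[].
inline char* DuplicateString(const char* s) {
  assert(s != nullptr);
  const std::size_t n = std::strlen(s) + 1;
  char* copy = new char[n];
  std::memcpy(copy, s, n);
  return copy;
}

// Models the update path: store a heap copy and record that we own it.
inline void SetOpTypesToExclude(DemoOptions& o, const char* value) {
  char* copy = DuplicateString(value);  // copy first, in case value aliases the old buffer
  if (o.op_types_to_exclude_is_dynamic) delete[] o.op_types_to_exclude;
  o.op_types_to_exclude = copy;
  o.op_types_to_exclude_is_dynamic = 1;
}

// Models the release path: free the buffer only if the flag says it came from new[].
inline void ReleaseOpTypesToExclude(DemoOptions& o) {
  if (o.op_types_to_exclude && o.op_types_to_exclude_is_dynamic) {
    delete[] o.op_types_to_exclude;
  }
  o.op_types_to_exclude = nullptr;
  o.op_types_to_exclude_is_dynamic = 0;
}
```

Without the flag, the release function has no way to distinguish the literal from a heap buffer through the `const char*` alone, which is the difficulty the comment describes.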

Member

It's an internal file. Why don't we just use a std::string? Then there will be no need to manually delete[] buffers.
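
A rough sketch of what the std::string suggestion could look like, assuming the field were free to change type. `StringOptions`, `SetOpTypesToExcludeString` and `ExerciseStringOptions` are hypothetical names for illustration, not existing ONNX Runtime code.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical variant of the options struct that owns its string.
struct StringOptions {
  std::string op_types_to_exclude{"NonMaxSuppression,NonZero,RoiAlign"};
};

// Replaces the value; the previous buffer is released by std::string itself.
inline void SetOpTypesToExcludeString(StringOptions& o, std::string value) {
  o.op_types_to_exclude = std::move(value);
}

// Exercises the sketch: copies and destruction need no manual cleanup.
inline void ExerciseStringOptions() {
  StringOptions a;                   // holds the default list
  StringOptions b = a;               // deep copy, each object owns its own buffer
  SetOpTypesToExcludeString(b, "");  // empty string: include all ops
  assert(a.op_types_to_exclude == "NonMaxSuppression,NonZero,RoiAlign");
  assert(b.op_types_to_exclude.empty());
}  // both strings are freed here by their destructors
```

With this shape there is no flag and no delete[] in a release function; code that still needs a `const char*` can call `c_str()` on the member.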

Contributor Author

After further discussion, we decided to close this PR because of the ugly handling of the buffers.

+ if (string_copy) trt_provider_options_v2.trt_op_types_to_exclude_str_is_dynamic_allocation = 1;
+ }
Member

@jywu-msft, Nov 16, 2024

What if we dynamically allocate the "NonMaxSuppression,NonZero,RoiAlign" string in the else case here, when trt_op_types_to_exclude is not found in options? Then we would always need to deallocate it.

Contributor Author

@chilo-ms, Nov 18, 2024

what if we dynamically allocate the "NonMaxSuppression,NonZero,RoiAlign" string in the else case here if we don't find trt_op_types_to_exclude in options.

I assume you mean the default value "NonMaxSuppression,NonZero,RoiAlign" of trt_op_types_to_exclude. It's a static allocation, so there is no need to deallocate it; it is released automatically by the program.

Otherwise, if the user updates the OrtTensorRTProviderOptionsV2 instance through OrtApis::UpdateTensorRTProviderOptions, I don't think the case you mention exists.

If the user doesn't use our API and sets trt_op_types_to_exclude directly in the OrtTensorRTProviderOptionsV2 instance, it's the user's responsibility to deallocate the buffer.
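
For comparison, the "always allocate" idea raised in this thread could be sketched as follows. This is a hypothetical illustration, not the PR's code: `AlwaysOwnedOptions`, `CopyToHeap`, `AssignOpTypesToExclude` and `ReleaseAlwaysOwned` are made-up names, and the sketch assumes the field starts as nullptr instead of a string literal, which differs from the struct shown in the diff.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical names throughout; this is not the ONNX Runtime implementation.
constexpr const char* kDefaultOpTypesToExclude = "NonMaxSuppression,NonZero,RoiAlign";

struct AlwaysOwnedOptions {
  // Assumption: this pointer is either nullptr or a new[] buffer, never a literal.
  const char* op_types_to_exclude{nullptr};
};

// Returns a heap copy of s allocated with new[].
inline char* CopyToHeap(const char* s) {
  assert(s != nullptr);
  const std::size_t n = std::strlen(s) + 1;
  char* copy = new char[n];
  std::memcpy(copy, s, n);
  return copy;
}

// user_value == nullptr models "key not found in options" (the else case):
// the default is copied to the heap too, so the field always owns its buffer.
inline void AssignOpTypesToExclude(AlwaysOwnedOptions& o, const char* user_value) {
  char* copy = CopyToHeap(user_value ? user_value : kDefaultOpTypesToExclude);
  delete[] o.op_types_to_exclude;  // delete[] on nullptr is a no-op
  o.op_types_to_exclude = copy;
}

// Release can delete unconditionally because no literal is ever stored.
inline void ReleaseAlwaysOwned(AlwaysOwnedOptions& o) {
  delete[] o.op_types_to_exclude;
  o.op_types_to_exclude = nullptr;
}
```

The trade-off is that the invariant holds only if every code path goes through the assign function; an instance that still carries the literal default initializer, as described in the reply above, would break it.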

}
} // namespace onnxruntime
2 changes: 1 addition & 1 deletion onnxruntime/core/session/provider_bridge_ort.cc
@@ -2413,7 +2413,7 @@ ORT_API(void, OrtApis::ReleaseTensorRTProviderOptions, _Frees_ptr_opt_ OrtTensor
delete[] ptr->trt_profile_opt_shapes;
delete[] ptr->trt_ep_context_file_path;
delete[] ptr->trt_onnx_model_folder_path;
- if (!ptr->trt_op_types_to_exclude) delete[] ptr->trt_op_types_to_exclude;
+ if (ptr->trt_op_types_to_exclude && ptr->trt_op_types_to_exclude_str_is_dynamic_allocation) delete[] ptr->trt_op_types_to_exclude;
}

std::unique_ptr<OrtTensorRTProviderOptionsV2> p(ptr);