[ORT 1.20.1 Release] Cherry pick 2nd round #22845

Merged
merged 5 commits into rel-1.20.1 from yifanl/round-2-cherry-pick-rel-1.20.1
Nov 19, 2024



@yf711 yf711 commented Nov 14, 2024

…n TRT (#22681)

Add a new provider option, `trt_op_types_to_exclude`:
- Users can provide a list of op types to exclude from running on TRT.
- e.g. `trt_op_types_to_exclude="MaxPool"`

There is a known performance issue with the DDS (data-dependent shape) ops
NonMaxSuppression, NonZero and RoiAlign in TRT versions 10.0 through 10.7.
The TRT EP therefore excludes DDS ops from running on TRT by default. Users
can override the default value with an empty string to include all ops.
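
As a rough illustration of how the option might be passed from Python, the sketch below builds a `providers` list of the kind accepted by `onnxruntime.InferenceSession`. It assumes an ONNX Runtime build whose TensorRT EP recognizes `trt_op_types_to_exclude`. The helper name `trt_providers`, the model path, and the comma-separated format for multiple op types are assumptions that this description does not state.

```python
from typing import Optional, Sequence


def trt_providers(exclude_ops: Optional[Sequence[str]] = None) -> list:
    """Build a providers list for onnxruntime.InferenceSession (hypothetical helper).

    exclude_ops=None -> leave the option unset, so the EP default applies.
    exclude_ops=[]   -> pass an empty string, i.e. include all ops on TRT.
    """
    trt_options = {}
    if exclude_ops is not None:
        # Assumption: multiple op types are joined with commas.
        trt_options["trt_op_types_to_exclude"] = ",".join(exclude_ops)
    return [
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",  # fallback for nodes not placed on TRT
        "CPUExecutionProvider",
    ]


# Usage, with onnxruntime-gpu installed and a model at a hypothetical path:
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx",
#                                  providers=trt_providers(["MaxPool"]))
```

Passing `[]` maps to the empty-string override described above, while omitting the argument leaves the EP's default exclusion of DDS ops in place.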
@jywu-msft jywu-msft requested a review from chilo-ms November 15, 2024 19:45

chilo-ms commented Nov 16, 2024

Could we consider cherry-picking this PR?
#22863

Update: We reverted TRT EP's change, so this PR is no longer needed.

liqunfu and others added 3 commits November 17, 2024 23:58
Signed-off-by: Liqun Fu <liqfu@microsoft.com>
Signed-off-by: Liqun Fu <liqun.fu@microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
The TRT EP excludes DDS ops from running on TRT and does not allow users to
change this.
This PR is for the ORT 1.20.1 patch release.

A better solution will follow: a new provider option to exclude
specific ops, similar to the following:
#22863
#22681
### Description
- Update pipelines to use QNN SDK 2.28.2.241116.
- Re-enable LayerNormalization unit tests that failed with accuracy
errors under the previous QNN SDK (2.28.0).
- Update QNN EP to no longer provide a dummy bias for LayerNorm if the
QNN SDK version is >= 2.28.0.
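
The version gate in the last bullet can be sketched as follows. This is only an illustration in Python, not the QNN EP's actual check, which is not shown here. The function names `parse_sdk_version` and `needs_dummy_bias` are hypothetical, and the sketch assumes SDK versions are dot-separated integers such as `2.28.2.241116`.

```python
def parse_sdk_version(version: str) -> tuple:
    """Split a dotted version string such as '2.28.2.241116' into integers."""
    return tuple(int(part) for part in version.split("."))


def needs_dummy_bias(sdk_version: str) -> bool:
    """Return True only for SDK versions older than 2.28.0.

    Comparing against (2, 28) relies on Python's tuple ordering:
    (2, 27, x) sorts before (2, 28), while (2, 28) and (2, 28, 0) do not.
    """
    return parse_sdk_version(sdk_version) < (2, 28)


# The SDK version used by the updated pipelines would skip the dummy bias.
print(needs_dummy_bias("2.28.2.241116"))
```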


### Motivation and Context
Use the latest QNN SDK. This version improves inference latency for
certain customer models.
@yf711 yf711 requested a review from a team as a code owner November 19, 2024 04:11
@yf711 yf711 merged commit 5c1b7cc into rel-1.20.1 Nov 19, 2024
238 of 245 checks passed
@yf711 yf711 deleted the yifanl/round-2-cherry-pick-rel-1.20.1 branch November 19, 2024 21:25

5 participants