
[Quantization speedup] Support TensorRT 8.0.0 #3866

Merged · 3 commits into microsoft:master on Jul 9, 2021

Conversation

linbinskn
Contributor

This PR aims to support the latest TensorRT version. The current quantization speedup tool is implemented on top of TensorRT 7.0. However, the TensorRT Python API changed in version 8.0: network definition and builder configuration are now separated, and all low-precision settings have been moved to IBuilderConfig, so our current implementation no longer works with it (the problem raised in issue #3857). This PR adds support for the new TensorRT version and its API.
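For readers unfamiliar with the change, here is a minimal sketch (not code from this PR) of what the API split looks like for an INT8 engine build. The ONNX parsing path, the `calibrator` argument, and the function name are assumptions made for illustration only.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_int8_engine(onnx_path, calibrator, max_workspace_size=1 << 30):
    """Hypothetical helper: build an INT8 engine with the TensorRT >= 8.0 API."""
    builder = trt.Builder(TRT_LOGGER)
    # Network definition is created separately from the build configuration.
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)
    with open(onnx_path, 'rb') as f:
        parser.parse(f.read())

    # TensorRT < 8.0 set precision on the builder itself, e.g.:
    #   builder.int8_mode = True
    #   builder.int8_calibrator = calibrator
    #   engine = builder.build_cuda_engine(network)
    # In TensorRT >= 8.0, these settings live on IBuilderConfig instead.
    config = builder.create_builder_config()
    config.max_workspace_size = max_workspace_size
    config.set_flag(trt.BuilderFlag.INT8)
    config.int8_calibrator = calibrator
    return builder.build_engine(network, config)
```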

@J-shang
Contributor

J-shang commented Jul 1, 2021

Does this mean we upgrade the TensorRT dependency to >= 8.0? Is this upgrade available for most users?
And please update the quantization speedup doc for this change.

@linbinskn
Contributor Author

> Does this mean we upgrade the TensorRT dependency to >= 8.0? Is this upgrade available for most users?
> And please update the quantization speedup doc for this change.

@J-shang Good point! I have updated the quantization speedup doc. I think separating network definition from configuration in the latest TensorRT version is reasonable, and supporting the latest version is necessary for us. I believe most people will use the latest version, especially those who want to try mixed precision in TensorRT.

J-shang merged commit a4760ce into microsoft:master on Jul 9, 2021