Quantization problem with int8 #10240
lingl-space started this conversation in General
I use the base model for precision=fp32 inference and the corresponding slim model for precision=int8 inference. However, when ocr.ocr() is run in main.cpp, the measured fp32 inference time is only half that of int8. This is the exact opposite of the speedup that int8 quantization is supposed to provide.
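In case it helps reproduce the comparison, below is a minimal timing sketch (a sketch only: `avg_ms`, `ocr_fp32`, `ocr_int8`, and `img_list` are illustrative placeholders, not the actual PPOCR API from main.cpp). It warms up before timing so that one-off costs such as engine initialization or int8 calibration loading are not counted against either precision:

```cpp
#include <chrono>
#include <iostream>

// Generic timing helper: run `warmup` untimed iterations first so one-off
// costs (engine build, allocation, int8 calibration loading) are excluded,
// then report the average over `iters` timed runs.
template <typename F>
double avg_ms(F&& run_once, int warmup = 3, int iters = 20) {
  for (int i = 0; i < warmup; ++i) run_once();
  auto t0 = std::chrono::steady_clock::now();
  for (int i = 0; i < iters; ++i) run_once();
  auto t1 = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(t1 - t0).count() / iters;
}

int main() {
  // Placeholders: substitute the real engine objects and ocr() calls
  // from main.cpp here.
  double fp32_ms = avg_ms([&] { /* ocr_fp32.ocr(img_list); */ });
  double int8_ms = avg_ms([&] { /* ocr_int8.ocr(img_list); */ });
  std::cout << "fp32: " << fp32_ms << " ms/run, int8: " << int8_ms
            << " ms/run\n";
  return 0;
}
```

Note that even with a steady-state measurement like this, int8 is only faster when the hardware and inference backend actually execute int8 kernels; otherwise the extra quantize/dequantize work can make it slower than fp32.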