Quantization User Guide edits #3348
base: develop
Conversation
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Diff excerpt:

    Use Cases
    ########################
    AIMET model quantization
Can we make the first character of each word capital in the titles?
Al said to use the Microsoft style guide. Microsoft uses sentence-style capitalization in titles. We can discuss varying from that if it's important.
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
This all looks good to me from a rst perspective. I would like to see a rendered example as well.
Docs/user_guide/model_guidelines.rst
Diff excerpt:

    If the view function is written as ``x = x.view(-1, 1024)``,
    it causes an issue; write the forward function as follows:
    ``x = x.view(x.size(0), -1)``
Can we reference Python functions using the :func: role for better readability? For example, :func:`x.view()`, :func:`x.view(x.size(0), -1)`.
Reformatted the page, including code samples.
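The guideline above can be sketched in plain PyTorch (the tensor shapes below are illustrative, not from the guide):

```python
import torch

x = torch.randn(4, 16, 8, 8)  # batch of 4; 16 * 8 * 8 = 1024 features each

# Hardcoding the feature count ties the reshape to one input size and can
# break when the input shape (or batch size) changes:
flat_hardcoded = x.view(-1, 1024)

# Deriving the batch dimension from the tensor keeps the op shape-agnostic:
flat_dynamic = x.view(x.size(0), -1)

assert flat_hardcoded.shape == flat_dynamic.shape == (4, 1024)
```

Both calls produce the same result here, but only the second form survives a change in input resolution without editing the model.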
Diff excerpt:

    .. image:: ../images/quant_use_case_2.PNG

    .. _aimet-quantization-features:

    AIMET Quantization Features
Does this page even exist? _aimet-quantization-features:
Let's make sure before cross-referencing it.
Diff excerpt (old):

    than post-training quantization. However, the higher accuracy comes with the usual costs of neural
    network training, i.e. longer training times, need for labeled data and hyperparameter search.

Diff excerpt (new):

    When post-training quantization (PTQ) doesn't sufficiently reduce quantization error, the next step is to use quantization-aware training (QAT). QAT finds more accurate solutions than PTQ by modeling the quantization noise during training. This higher accuracy comes at the usual cost of neural network training, including longer training times and the need for labeled data and hyperparameter search.
Suggestion:
Change: "QAT finds more accurate solutions than PTQ by modeling the quantization noise during training"
To: "QAT finds more optimal solutions than PTQ by finetuning the model parameters in the presence of quantization noise"
@quic-mtuttle I made this change, but used "better-optimized" rather than "more optimal". Question: What does "optimal" mean here if not "accurate"?
Diff excerpt (old):

    When set to "False" or omitting the parameter altogether, quantizers which are configured in symmetric mode will not use strict symmetric quantization.

Diff excerpt (new):

    Optional. If included, value is "True" or "False".
    "True" causes quantizers configured in symmetric mode to use strict symmetric quantization.
    "False", or omitting the parameter, causes quantizers configured in symmetric mode to not use strict symmetric quantization.
Change in indent here messes with the formatting
Diff excerpt (old):

    When set to "False" or omitting the parameter altogether, quantizers which are configured in symmetric mode will not use unsigned symmetric quantization.

Diff excerpt (new):

    Optional. If included, value is "True" or "False".
    "True" causes quantizers configured in symmetric mode to use unsigned symmetric quantization when available.
    "False", or omitting the parameter, causes quantizers configured in symmetric mode to not use unsigned symmetric quantization.
Same comment on indentation
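As an illustration of what these two flags control, the sketch below (not AIMET's implementation; the function name and defaults are invented for this example) shows how the strict-symmetric and unsigned-symmetric settings change the integer grid an 8-bit symmetric quantizer can map values onto:

```python
# Illustrative sketch only -- not AIMET code.
def int8_symmetric_grid(strict_symmetric=False, unsigned_symmetric=False,
                        data_min=0.0):
    """Return the (min, max) integer codes available to an 8-bit quantizer."""
    if unsigned_symmetric and data_min >= 0.0:
        # All observed data is non-negative: the full unsigned range is usable.
        return (0, 255)
    if strict_symmetric:
        # Symmetric about zero: drop one negative code so |min| == |max|.
        return (-127, 127)
    # Default signed two's-complement range (asymmetric by one code).
    return (-128, 127)

# Defaults: the regular signed range.
assert int8_symmetric_grid() == (-128, 127)
# Strict symmetric trades one code for an exactly symmetric grid.
assert int8_symmetric_grid(strict_symmetric=True) == (-127, 127)
# Unsigned symmetric only applies when the observed data is non-negative.
assert int8_symmetric_grid(unsigned_symmetric=True, data_min=-0.5) == (-128, 127)
```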
Diff excerpt (old):

    When set to "False" or omitting the parameter altogether, parameter quantizers will use per-tensor quantization.

Diff excerpt (new):

    Optional. If included, value is "True" or "False".
    "True" causes parameter quantizers to use per-channel quantization rather than per-tensor quantization.
    "False", or omitting the parameter, causes parameter quantizers to use per-tensor quantization.
Minor: Can make the phrasing here consistent with the other flags
Diff excerpt (old):

    When set to "False", parameter quantizers of this op type will use per-tensor quantization.
    By omitting the setting, parameter quantizers of this op type will fall back to the setting specified by the defaults section.

Diff excerpt (new):

    Optional. If included, value is "True" or "False".
    "True" sets parameter quantizers of this op type to use per-channel quantization rather than per-tensor quantization.
Same comment on indentation
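For context, the flags discussed above live in a quantsim configuration JSON. The fragment below is an illustrative sketch assembled from the field names quoted in this review; the surrounding structure (a "defaults" section with per-op-type overrides) is an assumption, not text quoted from the docs:

```json
{
  "defaults": {
    "strict_symmetric": "False",
    "unsigned_symmetric": "True",
    "per_channel_quantization": "False"
  },
  "op_type": {
    "Conv": {
      "per_channel_quantization": "True"
    }
  }
}
```

Note that an op-type entry overrides the defaults section only for the flags it sets; omitted flags fall back to "defaults", as described above.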
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Grammar and style edits to the Quantization User Guide.