Quantization User Guide edits #3348
base: develop
Conversation
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Diff excerpt:

    Use Cases
    ########################
    AIMET model quantization
Can we make the first character of each word capital in the titles?
Al said to use the Microsoft style guide. Microsoft uses sentence-style capitalization in titles. We can discuss varying from that if it's important.
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
This all looks good to me from a rst perspective. I would like to see a rendered example as well.
Docs/user_guide/model_guidelines.rst
Diff excerpt:

    If the view function is written as ``x = x.view(-1, 1024)``,
    it causes an issue; write the forward function as follows:
    ``x = x.view(x.size(0), -1)``
Can we reference Python functions using the :func: role for better readability? For example, :func:`x.view()`, :func:`x.view(x.size(0), -1)`.
Reformatted the page, including code samples.
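The guideline above can be sketched in plain PyTorch (the tensor shapes below are illustrative, not from the guide):

```python
import torch

x = torch.randn(4, 16, 8, 8)  # batch of 4; 16 * 8 * 8 = 1024 features each

# Hardcoding the feature count ties the reshape to one input size and can
# break when the input shape (or batch size) changes:
flat_hardcoded = x.view(-1, 1024)

# Deriving the batch dimension from the tensor keeps the op shape-agnostic:
flat_dynamic = x.view(x.size(0), -1)

assert flat_hardcoded.shape == flat_dynamic.shape == (4, 1024)
```

Both calls produce the same result here, but only the second form survives a change in input resolution without editing the model.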
Diff excerpt:

    .. image:: ../images/quant_use_case_2.PNG

    .. _aimet-quantization-features:

    AIMET Quantization Features
Does this page even exist? _aimet-quantization-features:
Let's make sure before cross-referencing it.
Diff excerpt (old):

    than post-training quantization. However, the higher accuracy comes with the usual costs of neural
    network training, i.e. longer training times, need for labeled data and hyperparameter search.

Diff excerpt (new):

    When post-training quantization (PTQ) doesn't sufficiently reduce quantization error, the next step is to use quantization-aware training (QAT). QAT finds more accurate solutions than PTQ by modeling the quantization noise during training. This higher accuracy comes at the usual cost of neural network training, including longer training times and the need for labeled data and hyperparameter search.
Suggestion:
Change: "QAT finds more accurate solutions than PTQ by modeling the quantization noise during training"
To: "QAT finds more optimal solutions than PTQ by finetuning the model parameters in the presence of quantization noise"
@quic-mtuttle I made this change, but used "better-optimized" rather than "more optimal". Question: What does "optimal" mean here if not "accurate"?
Diff excerpt (old):

    When set to "False" or omitting the parameter altogether, quantizers which are configured in symmetric mode will not use strict symmetric quantization.

Diff excerpt (new):

    Optional. If included, value is "True" or "False".
    "True" causes quantizers configured in symmetric mode to use strict symmetric quantization.
    "False", or omitting the parameter, causes quantizers configured in symmetric mode to not use strict symmetric quantization.
Change in indent here messes with the formatting
Diff excerpt (old):

    When set to "False" or omitting the parameter altogether, quantizers which are configured in symmetric mode will not use unsigned symmetric quantization.

Diff excerpt (new):

    Optional. If included, value is "True" or "False".
    "True" causes quantizers configured in symmetric mode to use unsigned symmetric quantization when available.
    "False", or omitting the parameter, causes quantizers configured in symmetric mode to not use unsigned symmetric quantization.
Same comment on indentation
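As an illustration of what these two flags control, the sketch below (not AIMET's implementation; the function name and defaults are invented for this example) shows how the strict-symmetric and unsigned-symmetric settings change the integer grid an 8-bit symmetric quantizer can map values onto:

```python
# Illustrative sketch only -- not AIMET code.
def int8_symmetric_grid(strict_symmetric=False, unsigned_symmetric=False,
                        data_min=0.0):
    """Return the (min, max) integer codes available to an 8-bit quantizer."""
    if unsigned_symmetric and data_min >= 0.0:
        # All observed data is non-negative: the full unsigned range is usable.
        return (0, 255)
    if strict_symmetric:
        # Symmetric about zero: drop one negative code so |min| == |max|.
        return (-127, 127)
    # Default signed two's-complement range (asymmetric by one code).
    return (-128, 127)

# Defaults: the regular signed range.
assert int8_symmetric_grid() == (-128, 127)
# Strict symmetric trades one code for an exactly symmetric grid.
assert int8_symmetric_grid(strict_symmetric=True) == (-127, 127)
# Unsigned symmetric only applies when the observed data is non-negative.
assert int8_symmetric_grid(unsigned_symmetric=True, data_min=-0.5) == (-128, 127)
```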
Diff excerpt (old):

    When set to "False" or omitting the parameter altogether, parameter quantizers will use per-tensor quantization.

Diff excerpt (new):

    Optional. If included, value is "True" or "False".
    "True" causes parameter quantizers to use per-channel quantization rather than per-tensor quantization.
    "False", or omitting the parameter, causes parameter quantizers to use per-tensor quantization.
Minor: Can make the phrasing here consistent with the other flags
Diff excerpt (old):

    When set to "False", parameter quantizers of this op type will use per-tensor quantization.
    By omitting the setting, parameter quantizers of this op type will fall back to the setting specified by the defaults section.

Diff excerpt (new):

    Optional. If included, value is "True" or "False".
    "True" sets parameter quantizers of this op type to use per-channel quantization rather than per-tensor quantization.
Same comment on indentation
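For context, the flags discussed above live in a quantsim configuration JSON. The fragment below is an illustrative sketch assembled from the field names quoted in this review; the surrounding structure (a "defaults" section with per-op-type overrides) is an assumption, not text quoted from the docs:

```json
{
  "defaults": {
    "strict_symmetric": "False",
    "unsigned_symmetric": "True",
    "per_channel_quantization": "False"
  },
  "op_type": {
    "Conv": {
      "per_channel_quantization": "True"
    }
  }
}
```

Note that an op-type entry overrides the defaults section only for the flags it sets; omitted flags fall back to "defaults", as described above.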
Signed-off-by: Dave Welsch <dwelsch@expertsupport.com>
Grammar and style edits to the Quantization User Guide.