
[Quant][PT2E] Enable linear and linear-unary post-op quant recipe for x86 inductor quantizer #106781

Closed
wants to merge 14 commits

Conversation

Xia-Weiwen
Collaborator

@Xia-Weiwen Xia-Weiwen commented Aug 8, 2023

Stack from ghstack (oldest at bottom):

Summary
Adds a linear and linear-unary post-op quantization recipe to the x86 Inductor quantizer, for PT2E with Inductor. With this, the quantization path will insert the quant-dequant pattern around linear ops and linear ops with a unary post op.

Test plan
```
python test/test_quantization.py -k test_linear_with_quantizer_api
python test/test_quantization.py -k test_linear_unary_with_quantizer_api
```
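The quant-dequant pattern that the recipe inserts can be illustrated with a minimal, self-contained sketch of per-tensor affine quantization. This is plain Python for illustration only; the real pass inserts `torch.ops.quantized_decomposed.quantize_per_tensor` / `dequantize_per_tensor` nodes, and the function names below are made up for the example:

```python
# Illustrative sketch of per-tensor affine quantize/dequantize, the
# pattern the recipe inserts around linear ops. Not the real torch ops.

def quantize_per_tensor(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to the int8 range: q = clamp(round(x / scale) + zp)."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]

def dequantize_per_tensor(qvalues, scale, zero_point):
    """Map integers back to floats: x_hat = (q - zp) * scale."""
    return [(q - zero_point) * scale for q in qvalues]

x = [0.5, -1.25, 2.0, 0.0]
scale, zp = 0.25, 0
q = quantize_per_tensor(x, scale, zp)
x_hat = dequantize_per_tensor(q, scale, zp)
# Round-trip error is bounded by half a quantization step.
assert all(abs(a - b) <= scale / 2 for a, b in zip(x, x_hat))
```

The downstream Inductor passes then pattern-match these quant-dequant pairs around the linear op and fuse them into quantized kernels.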

@pytorch-bot

pytorch-bot bot commented Aug 8, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/106781

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 8abd7f8 with merge base 808e088:

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

leslie-fang-intel pushed a commit to leslie-fang-intel/pytorch that referenced this pull request Aug 14, 2023
… x86 inductor quantizer

ghstack-source-id: 9a10b20bc82f3a6e8f0249aa2453b06327ed427c
Pull Request resolved: pytorch#106781
leslie-fang-intel pushed a commit to leslie-fang-intel/pytorch that referenced this pull request Aug 16, 2023
… x86 inductor quantizer

ghstack-source-id: 9a10b20bc82f3a6e8f0249aa2453b06327ed427c
Pull Request resolved: pytorch#106781
@leslie-fang-intel leslie-fang-intel marked this pull request as ready for review August 18, 2023 08:46
@Xia-Weiwen Xia-Weiwen requested a review from jgong5 August 22, 2023 05:33
@leslie-fang-intel
Collaborator

@jerryzh168 Could you also kindly take a look at this PR?

@@ -400,3 +417,80 @@ def test_conv2d_serials_binary_unary_with_quantizer_api(self):
node_occurrence,
node_list,
)

@skipIfNoX86
def test_linear_with_quantizer_api(self):
Contributor

nit: you can remove _with_quantizer_api now

Collaborator Author

Thanks. It's removed.

)

@skipIfNoX86
def test_linear_unary_with_quantizer_api(self):
Contributor

same here

Collaborator Author

Thanks. It's removed.

node_list = [
torch.ops.quantized_decomposed.quantize_per_tensor.default,
torch.ops.quantized_decomposed.dequantize_per_tensor.default,
torch.ops.aten.addmm.default if use_bias else torch.ops.aten.mm.default,
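The `addmm`-vs-`mm` branch in the expected node list reflects how `nn.Linear` decomposes after tracing through autograd: with a bias it becomes `addmm(bias, x, W^T)`, without one it becomes `mm(x, W^T)`. A tiny pure-Python sketch of the two decompositions (illustrative only, not the real aten implementations):

```python
# Why the test expects addmm when use_bias is True and mm otherwise:
# linear(x, W, b) decomposes to addmm(b, x, W^T) or mm(x, W^T).

def mm(a, b):
    """Plain matrix multiply of row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def addmm(bias, a, b):
    """mm(a, b) with a per-output-column bias added."""
    return [[bias[j] + v for j, v in enumerate(row)] for row in mm(a, b)]

def linear(x, weight, bias=None):
    wt = [list(col) for col in zip(*weight)]  # transpose: W^T
    return addmm(bias, x, wt) if bias is not None else mm(x, wt)

x = [[1.0, 2.0]]
w = [[3.0, 4.0], [5.0, 6.0]]   # 2 output features, 2 input features
assert linear(x, w) == [[11.0, 17.0]]             # mm path (no bias)
assert linear(x, w, [0.5, -0.5]) == [[11.5, 16.5]]  # addmm path (bias)
```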
Contributor

if you don't need to land this before branch cut, I think you could switch to capture_pre_autograd_graph for capture and you will see a single aten.linear op instead

Collaborator Author

Hi @jerryzh168. Thanks for the suggestion. We hope to land this before the code freeze, and switching to the new API on such short notice would be difficult. Is it OK to land this first and then switch to the new API later?

Contributor

sure, that's fine

@leslie-fang-intel leslie-fang-intel added the ciflow/trunk Trigger trunk jobs on your pull request label Aug 26, 2023
@Xia-Weiwen
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


pytorchmergebot pushed a commit that referenced this pull request Aug 27, 2023
…ant folding (#106782)

**Summary**
To implement weight prepacking for quantized linear, we replace the following pattern
```
int8 activation
      |
dequant_per_tensor
      |
mm/addmm <- t <- dequant_per_channel <- int8_weight
```
with
```
int8 activation
  |
onednn.qlinear_pointwise <- onednn.qlinear_prepack <- int8_weight
```
The weight-prepack path is registered inside Inductor constant folding: constant folding evaluates the prepack op and replaces it with the prepacked weight (a constant parameter).
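The key idea is that the prepack op's inputs (the int8 weight and its quantization parameters) are all constants, so constant folding can evaluate it once at compile time. A minimal sketch of that mechanism in plain Python; the function names here stand in for `onednn.qlinear_prepack` and Inductor's constant folder and are not the real APIs:

```python
# Sketch of weight prepacking via constant folding: because the prepack
# op consumes only constants, it can be evaluated once at compile time
# and replaced by its (constant) result. Names are illustrative.

def prepack_weight(int8_weight, scale):
    """Stand-in for the prepack op: dequantize and transpose the weight
    once, ahead of time, so runtime only does x @ W^T."""
    deq = [[q * scale for q in row] for row in int8_weight]
    return [list(col) for col in zip(*deq)]  # W^T

def constant_fold(graph_constants, op, *arg_names):
    """If every input to `op` is a known graph constant, evaluate the op
    now and return the result as a new constant parameter."""
    args = [graph_constants[name] for name in arg_names]
    return op(*args)

consts = {"int8_weight": [[10, -20], [30, 40]], "scale": 0.5}
packed = constant_fold(consts, prepack_weight, "int8_weight", "scale")
assert packed == [[5.0, 15.0], [-10.0, 20.0]]
```

At runtime the graph then contains only the prepacked constant and the fused quantized-linear op, with no per-iteration dequantize of the weight.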

**Test plan**
python test/inductor/test_mkldnn_pattern_matcher.py -k test_qlinear_unary

Pull Request resolved: #106782
Approved by: https://github.com/jgong5, https://github.com/leslie-fang-intel, https://github.com/eellison
ghstack dependencies: #105818, #106781
pytorchmergebot pushed a commit that referenced this pull request Aug 27, 2023
…ductor (#106934)

**Summary**
Enable lowering of quantized linear in Inductor

**Test plan**
python test/inductor/test_mkldnn_pattern_matcher.py -k test_qlinear_unary

Pull Request resolved: #106934
Approved by: https://github.com/jgong5, https://github.com/leslie-fang-intel, https://github.com/eellison
ghstack dependencies: #105818, #106781, #106782
pytorchmergebot pushed a commit that referenced this pull request Aug 27, 2023
**Summary**
Previously, the unit test for dequant promotion in Inductor covered only convolution. This adds a linear case to the test. This is for quantization PT2E with Inductor.

**Test plan**
python test/inductor/test_mkldnn_pattern_matcher.py -k test_dequant_promotion

Pull Request resolved: #106935
Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #105818, #106781, #106782, #106934
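Dequant promotion addresses the case where one dequantize node feeds several consumers: the node is duplicated per consumer so that each (dequant → op) pair can be pattern-matched and fused independently. A toy sketch of that transform on a dictionary-based graph (the representation and names are invented for illustration, not Inductor's actual IR):

```python
# Toy sketch of dequant promotion: give every consumer of a shared
# dequantize node its own private copy so fusion can match each
# (dequant -> linear) pair separately. Illustrative only.

def promote_dequant(nodes, edges):
    """nodes: name -> op kind; edges: consumer name -> list of input names."""
    new_nodes, new_edges = dict(nodes), {}
    counter = 0
    for consumer, inputs in edges.items():
        new_inputs = []
        for inp in inputs:
            users = sum(inp in ins for ins in edges.values())
            if nodes.get(inp) == "dequantize" and users > 1:
                counter += 1
                clone = f"{inp}_copy{counter}"
                new_nodes[clone] = "dequantize"  # private copy for this consumer
                new_inputs.append(clone)
            else:
                new_inputs.append(inp)
        new_edges[consumer] = new_inputs
    return new_nodes, new_edges

nodes = {"dq": "dequantize", "linear1": "linear", "linear2": "linear"}
edges = {"linear1": ["dq"], "linear2": ["dq"]}
n2, e2 = promote_dequant(nodes, edges)
assert e2["linear1"][0] != e2["linear2"][0]  # each linear got its own dequant
```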
@facebook-github-bot facebook-github-bot deleted the gh/Xia-Weiwen/18/head branch August 30, 2023 14:16
Labels
ciflow/trunk · Merged · open source · release notes: quantization

6 participants