[WIP] GPTQ MultiTensor refactor #716

danielpatrickhug · 2024-08-20T18:17:25Z

This PR responds to issue #577. Its primary contributions are:

* Refactor the GPTQ module to use the MultiTensor class described here, replacing the GenericGPTQRunner class, which subclasses fx.Interpreter.
* Refactor the GPTQQuantizer and Int4WeightOnlyGPTQQuantizer classes, which previously depended on GenericGPTQRunner.
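
For readers unfamiliar with the idea, here is a minimal sketch of how a MultiTensor-style container can replace graph interpretation: instead of walking an fx graph, the container intercepts each torch call through `__torch_function__` and replays it once per held tensor. The class body, the names `values` and `calibration_inputs`, and the equal-length assumption are all illustrative and are not taken from this PR's implementation.

```python
import torch


class MultiTensor:
    """Hypothetical stand-in: holds one tensor per calibration input."""

    def __init__(self, values):
        self.values = list(values)

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        # Assumption: every MultiTensor argument holds the same number of values,
        # and MultiTensors appear only as top-level positional arguments.
        count = next(len(a.values) for a in args if isinstance(a, cls))
        outputs = []
        for i in range(count):
            # Swap each MultiTensor for its i-th plain tensor, then run the op.
            plain_args = [a.values[i] if isinstance(a, cls) else a for a in args]
            outputs.append(func(*plain_args, **kwargs))
        return cls(outputs)


calibration_inputs = MultiTensor([torch.ones(2, 3), torch.zeros(2, 3)])
weight = torch.ones(3, 4)
out = torch.matmul(calibration_inputs, weight)  # dispatches to __torch_function__
```

Because the interception happens at call time, the model can run as ordinary eager code, which is presumably what makes the fx.Interpreter subclass unnecessary.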


pytorch-bot · 2024-08-20T18:17:28Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/716

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 · 2024-08-21T19:35:35Z

@danielpatrickhug thanks for working on this, could you follow the flow described in #721 for the refactor? we want to deprecate the "Quantizer" based APIs

jerryzh168 · 2024-08-21T19:36:35Z

torchao/quantization/GPTQ_MT.py

+            total_batches = 0
+
+        outputs = []
+        with torch._C.DisableTorchFunctionSubclass():


this part can also be decoupled from the MultiTensor I feel
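
For context on the context manager in the snippet under review: `torch._C.DisableTorchFunctionSubclass()` is a private PyTorch API that temporarily switches off `__torch_function__` overrides on Tensor subclasses, so calls inside the block run as plain tensor operations. A small sketch of that behavior, using a hypothetical `CountingTensor` subclass that is not part of this PR:

```python
import torch


class CountingTensor(torch.Tensor):
    """Hypothetical subclass that counts how often its override is invoked."""

    hits = 0

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        cls.hits += 1
        return super().__torch_function__(func, types, args, kwargs or {})


x = torch.ones(3).as_subclass(CountingTensor)

torch.add(x, 1)  # goes through CountingTensor.__torch_function__
hits_with_override = CountingTensor.hits

with torch._C.DisableTorchFunctionSubclass():
    torch.add(x, 1)  # override is skipped inside the block
hits_after_disabled_call = CountingTensor.hits
```

The counter does not change across the second call, which is the property a per-tensor replay loop would rely on to avoid re-entering the subclass handler.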

danielpatrickhug · 2024-08-22T16:17:00Z

> @danielpatrickhug thanks for working on this, could you follow the flow described in #721 for the refactor? we want to deprecate the "Quantizer" based APIs

Sure, no problem. I'll work on adapting it to this API format.

HDCharles · 2024-08-28T19:14:04Z

> @danielpatrickhug thanks for working on this, could you follow the flow described in #721 for the refactor? we want to deprecate the "Quantizer" based APIs

I don't think we should combine a functional refactor with an API refactor. This should be a drop-in replacement for what we had; otherwise, we should make those API changes first and let OSS devs make their changes on top. I can do the API alignment after the functional refactor is complete.


danielpatrickhug and others added 7 commits August 16, 2024 17:27

add inital [WIP] of MultiTensor rewrite of GPTQ

0f6f287

Implement GPTQQuantizer and Int4WeightOnlyGPTQQuantizer

fbcc555

Merge branch 'pytorch:main' into gptq_multitensor_refactor

d0b6de7

added GPTQuantizer and Int4WeightOnlyGPTQQuantizer classes. Removed QuantizableModel class

4c4045a

Merge branch 'pytorch:main' into gptq_multitensor_refactor

9e391f9

removed print statement

4951a58

Merge branch 'pytorch:main' into gptq_multitensor_refactor

ed2957f

facebook-github-bot added the CLA Signed label Aug 20, 2024

danielpatrickhug marked this pull request as draft August 21, 2024 14:20

jerryzh168 reviewed Aug 21, 2024


msaroufim requested a review from HDCharles August 22, 2024 18:00

danielpatrickhug and others added 6 commits September 5, 2024 11:02

Merge branch 'pytorch:main' into gptq_multitensor_refactor

33cf48d

fix control structure in torch_function. layer outputs not properly handled

34c84d4

add testing script for gptq Multitensor

b2dae79

Merge branch 'pytorch:main' into gptq_multitensor_refactor

4f1338a

Merge branch 'pytorch:main' into gptq_multitensor_refactor

038df90

Merge branch 'pytorch:main' into gptq_multitensor_refactor

f26687d

yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024

Handle MPS with for export and generate+compile (pytorch#716)

e503d7a

* Handle compile for export and generate * typo * typo * typo
