[WIP] GPTQ MultiTensor refactor #716
base: main
Conversation
…uantizableModel class
🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/716

Note: Links to docs will display an error until the docs builds have completed. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@danielpatrickhug thanks for working on this. Could you follow the flow described in #721 for the refactor? We want to deprecate the "Quantizer"-based APIs.
```python
total_batches = 0

outputs = []
with torch._C.DisableTorchFunctionSubclass():
```
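For context on the snippet under review: `torch._C.DisableTorchFunctionSubclass` is typically used inside a tensor subclass's `__torch_function__` so that re-dispatching the intercepted op runs as a plain tensor operation instead of recursing back into the subclass. A minimal sketch of that pattern, assuming a hypothetical `TracingTensor` for illustration (the actual MultiTensor implementation is the one in this PR):

```python
import torch

calls = []  # record of intercepted torch ops, for demonstration

class TracingTensor(torch.Tensor):
    """Hypothetical subclass that logs every torch op it sees."""

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        calls.append(func)
        # Re-dispatch with subclass handling disabled so the call below
        # does not re-enter __torch_function__ and recurse forever.
        with torch._C.DisableTorchFunctionSubclass():
            return func(*args, **kwargs)

x = torch.Tensor._make_subclass(TracingTensor, torch.ones(3))
y = x + 1  # intercepted once, then executed as a plain tensor add
```

The same mechanism lets a MultiTensor-style subclass intercept each op during GPTQ calibration, process its grouped inputs, and then fall through to the ordinary tensor implementation.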
this part can also be decoupled from the MultiTensor I feel
Sure, no problem. I'll work on adapting it to this API format.
I don't think we should combine a functional refactor with an API refactor. This should be a drop-in replacement for what we had; otherwise we should make those API changes first and let OSS devs make their changes on top. I can do the API alignment after the functional refactor is complete.
* Handle compile for export and generate
* typo
* typo
* typo
This PR is in response to issue #577.
The primary contributions of this PR are as follows: