[WIP] Activation Aware Weight Quantization (AWQ) #743
Conversation
See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/743. ✅ No failures as of commit e7e329b with merge base 09b8b3c.
return insert_subclass

def awq_uintx(quant_dtype: torch.dtype = torch.uint4,
Can you just remove weight_quant_fn and add use_hqq to the function?
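The suggested change can be sketched as follows. This is a minimal illustration of the API shape under discussion, not torchao's actual implementation: the function names with `_before`/`_after` suffixes and the toy quantizers are hypothetical, and the real `awq_uintx` takes torch dtypes and returns a quantization config.

```python
from typing import Callable, Optional

# Before: the caller passes in an arbitrary weight quantization callable,
# which makes the public API harder to reason about and test.
def awq_uintx_before(quant_dtype: str = "uint4",
                     weight_quant_fn: Optional[Callable] = None) -> Callable:
    # Fall back to the default uintx path when no callable is given.
    return weight_quant_fn or (lambda w: ("uintx", w))

# After: a simple use_hqq flag selects between the two supported
# quantization paths internally.
def awq_uintx_after(quant_dtype: str = "uint4",
                    use_hqq: bool = False) -> Callable:
    if use_hqq:
        # HQQ-based weight quantization path (placeholder).
        return lambda w: ("hqq", w)
    # Default uintx weight quantization path (placeholder).
    return lambda w: ("uintx", w)
```

With the flag, callers no longer need to know which callable to construct; the two supported behaviors are enumerable and easy to cover in CI.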
Also, when the dtype is uint4, I think it's fine to just use TensorCoreTiledLayout.
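The layout defaulting being suggested might look like the sketch below. The layout names follow the discussion, but the `choose_layout` helper and its string-based dispatch are purely illustrative; the real torchao layout classes are constructed differently.

```python
def choose_layout(quant_dtype: str) -> str:
    # uint4 has a fast tinygemm-backed kernel, so the tensor-core tiled
    # packing is a reasonable hard-coded default for that dtype.
    if quant_dtype == "uint4":
        return "TensorCoreTiledLayout"
    # Other bit widths fall back to the generic uintx layout.
    return "UintxLayout"
```

Hard-coding the layout for uint4 removes one more knob from the public API, at the cost of the composability the optional layout argument provided.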
I added that as a feature so that people can find ways to compose on top of AWQ, especially if they come up with new kernels, but I agree it's not a necessary feature for an initial release.
Looks good, thanks for addressing all the comments! I think the main thing remaining is to use a use_hqq flag for awq_uintx so we don't take an arbitrary weight_quant_fn. Also make sure to fix the CI issues.
Integrate AWQ within the TorchAO framework
This shouldn't be in generate.py; it should be in eval so we can actually see the accuracy impact.
Adds AWQ per #530
To do: