-
Notifications
You must be signed in to change notification settings - Fork 27.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴 #33456
Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴 #33456
Conversation
2a29247
to
8db30bd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your work @jerryzh168 to enable serialization ! Really appreciate that you are doing the PRs on huggingface hub and transformers ! This looks pretty good ! I left a few comments
@SunMarc thanks for your thoughtful reviews! I have addressed all the comments I think, please take a look again, also not sure if the CI failure is relevant or not |
btw, current pytorch nightly has a perf regression: pytorch/ao#898 and we hope to fix this before 2.5 cherry-pick deadline |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating! LGTM! Just a nit. Let me know when the regression is fixed. I've pinged a core maintainer to review the PR
To fix the failing test, can you rebase on main ? |
4f44f40
to
dc76271
Compare
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
@ArthurZucker can you take a look at the PR? |
I've merged recently PR adding a new quantizer @jerryzh168. Sorry for that but could you rebase on main and update the |
f4d7c51
to
93f8c89
Compare
…nfig quantized model Summary: After huggingface/huggingface_hub#2440 we added non-safetensor serialization and deserialization in huggingface, with this we can now add the support in transformers Note that we don't plan to add safetensor serialization due to different goals of wrapper tensor subclass and safetensor see README for more details Test Plan: tested locally Reviewers: Subscribers: Tasks: Tags:
93f8c89
to
b9543c8
Compare
@SunMarc @ArthurZucker updated, please take a look again |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR, super sorry for the delay!
Super important to have serialization!
@property | ||
def is_serializable(self): | ||
return False | ||
def is_serializable(self, safe_serialization=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changing the property is a tad breaking, so let's just put the 🔴 on the PR!
Thanks a lot @jerryzh168 🤗 great contributions and I love that we can upload serialized quantized weights to the hub now! |
Summary:
After huggingface/huggingface_hub#2440 we added non-safetensor serialization and deserialization in huggingface, with this we can now add the support in transformers
Note that we don't plan to add safetensor serialization due to different goals of wrapper tensor subclass and safetensor see README for more details
Test Plan:
tested locally
https://gist.github.com/jerryzh168/965ccdbd595c9210d49cfbe31dc6705f
Reviewers:
Subscribers:
Tasks:
Tags: