Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴 #33456

Merged: 10 commits, Sep 30, 2024

@jerryzh168 jerryzh168 commented Sep 13, 2024

Summary:
huggingface/huggingface_hub#2440 added non-safetensors serialization and deserialization to huggingface_hub, so we can now add support for it in transformers.

Note that we don't plan to add safetensors serialization, because wrapper tensor subclasses and safetensors have different goals; see the README for more details.

Test Plan:
tested locally
https://gist.github.com/jerryzh168/965ccdbd595c9210d49cfbe31dc6705f
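
The round trip described in the summary can be sketched roughly as follows. This is a minimal sketch, not the PR's test script: it assumes torchao is installed, a CUDA device is available, and that `TorchAoConfig("int4_weight_only", group_size=128)` is accepted by the transformers version in use. The function name `save_and_reload_quantized` and its arguments are hypothetical and exist only for illustration.

```python
def save_and_reload_quantized(model_id, output_dir):
    """Quantize with torchao, save without safetensors, then load the result back."""
    # Heavy imports are deferred so that defining this function needs no extra packages.
    import torch
    from transformers import AutoModelForCausalLM, TorchAoConfig

    # Assumption: int4 weight-only quantization; other torchao schemes may apply.
    quantization_config = TorchAoConfig("int4_weight_only", group_size=128)
    quantized_model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
        quantization_config=quantization_config,
    )

    # torchao weights are wrapper tensor subclasses, so safetensors is turned off here.
    quantized_model.save_pretrained(output_dir, safe_serialization=False)

    # Reload the already-quantized checkpoint from disk.
    return AutoModelForCausalLM.from_pretrained(output_dir, device_map="cuda")
```

Passing `safe_serialization=False` is the important part: the checkpoint is then written with the pickle-based PyTorch format instead of `.safetensors` files.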

Reviewers:

Subscribers:

Tasks:

Tags:

@jerryzh168 changed the title from "Enable non-safetensor serialization and deserialization for TorchAoCo…" to "Enable non-safetensor ser/deser for TorchAoConfig quantized model" on Sep 13, 2024
@jerryzh168

cc @SunMarc @Wauplin can you take a look?

@SunMarc SunMarc left a comment

Thanks for your work on enabling serialization, @jerryzh168! I really appreciate that you are doing the PRs on both huggingface_hub and transformers. This looks pretty good! I left a few comments.

docs/source/en/quantization/torchao.md (outdated, resolved)
docs/source/en/quantization/torchao.md (resolved)
src/transformers/modeling_utils.py (outdated, resolved)
src/transformers/modeling_utils.py (outdated, resolved)
src/transformers/quantizers/quantizer_torchao.py (outdated, resolved)
src/transformers/utils/quantization_config.py (outdated, resolved)
src/transformers/quantizers/quantizer_torchao.py (outdated, resolved)
src/transformers/quantizers/quantizer_torchao.py (outdated, resolved)
@jerryzh168

@SunMarc thanks for your thoughtful reviews! I think I have addressed all the comments, so please take another look. I'm also not sure whether the CI failure is related to this PR.

@jerryzh168

btw, the current PyTorch nightly has a perf regression (pytorch/ao#898), and we hope to fix it before the 2.5 cherry-pick deadline.

@SunMarc SunMarc left a comment

Thanks for iterating! LGTM! Just a nit. Let me know when the regression is fixed. I've pinged a core maintainer to review the PR

src/transformers/modeling_utils.py (outdated, resolved)
@SunMarc

SunMarc commented Sep 19, 2024

To fix the failing test, can you rebase on main?

@jerryzh168

@ArthurZucker can you take a look at the PR?

@SunMarc

SunMarc commented Sep 25, 2024

I recently merged a PR that adds a new quantizer, @jerryzh168. Sorry about that, but could you rebase on main and update the is_serializable method?

@jerryzh168

@SunMarc @ArthurZucker updated, please take a look again

@ArthurZucker ArthurZucker left a comment

Thanks for this PR, super sorry for the delay!
Super important to have serialization!

-    @property
-    def is_serializable(self):
-        return False
+    def is_serializable(self, safe_serialization=None):

changing the property is a tad breaking, so let's just put the 🔴 on the PR!
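
To illustrate why turning the property into a method is breaking, here is a small self-contained sketch. The classes `OldQuantizer` and `NewQuantizer` are hypothetical stand-ins, not the transformers quantizer classes, and the method body (`return not safe_serialization`) is an assumption made only to keep the example short.

```python
class OldQuantizer:
    """Stand-in for the old shape: is_serializable is a property."""

    @property
    def is_serializable(self):
        return False


class NewQuantizer:
    """Stand-in for the new shape: is_serializable is a method."""

    def is_serializable(self, safe_serialization=None):
        # Assumption for illustration: only the non-safetensors path is serializable.
        return not safe_serialization


old, new = OldQuantizer(), NewQuantizer()

# Property-style access works as intended on the old class.
old_style_on_old = bool(old.is_serializable)

# The same access on the new class yields a bound method, which is always truthy,
# so caller code written against the property silently changes meaning.
old_style_on_new = bool(new.is_serializable)

# Callers have to be updated to call the method explicitly.
new_unsafe = new.is_serializable(safe_serialization=False)
new_safe = new.is_serializable(safe_serialization=True)

print(old_style_on_old, old_style_on_new, new_unsafe, new_safe)
```

Any downstream code that still reads `quantizer.is_serializable` without calling it would therefore take the "serializable" branch unconditionally, which is the kind of silent behavior change the 🔴 marker is meant to flag.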

@ArthurZucker changed the title from "Enable non-safetensor ser/deser for TorchAoConfig quantized model" to "Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴" on Sep 30, 2024
@ArthurZucker ArthurZucker merged commit 4bb49d4 into huggingface:main Sep 30, 2024
4 of 5 checks passed
@ArthurZucker

Thanks a lot @jerryzh168 🤗 great contributions and I love that we can upload serialized quantized weights to the hub now!
