Docs / Quantization: refactor quantization documentation #30942
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Awesome work @younesbelkada. Thanks for refactoring the docs so that users can better choose which quantization method to use!
Quantization techniques focus on representing data with less information while also trying to not lose too much accuracy. This often means converting a data type to represent the same information with fewer bits. For example, if your model weights are stored as 32-bit floating points and they're quantized to 16-bit floating points, this halves the model size, which makes it easier to store and reduces memory usage. Lower precision can also speed up inference because it takes less time to perform calculations with fewer bits.
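To make the size claim from the paragraph above concrete, here is a minimal sketch (using plain NumPy rather than any Transformers quantization API, and invented variable names) showing that casting 32-bit weights to 16-bit floats halves the byte footprint:

```python
import numpy as np

# Hypothetical "model weights": a 1000x1000 matrix stored in fp32.
weights_fp32 = np.random.randn(1000, 1000).astype(np.float32)

# Casting to fp16 keeps the same number of values but uses half the bits each.
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 4000000 bytes (4 bytes per value)
print(weights_fp16.nbytes)  # 2000000 bytes (2 bytes per value)
print(weights_fp16.nbytes / weights_fp32.nbytes)  # 0.5
```

The cast is lossy (fp16 has fewer mantissa bits), which is exactly the accuracy trade-off the docs describe; real quantization schemes such as int8 or 4-bit go further and add calibration to limit that loss.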
For those who are interested in learning more about quantization, do you think we could add links to the DLAI course?
Very nice! 🔥
Makes a lot of sense to create separate pages for each method especially if the community keeps adding new quantization methods!
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* refactor quant docs
* delete file
* rename to overview
* fix
* fix table
* fix
* add content
* fix library versions
* fix table
* fix table
* fix table
* fix table
* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* replace to quantization_config
* fix aqlm snippet
* add DLAI courses
* fix
* fix table
* fix bulet points

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Since this PR deletes a doc file, you should update the following file to ensure that users coming from other places don't get redirected to an empty/outdated page: https://github.com/huggingface/transformers/blob/main/docs/source/en/_redirects.yml
We likely want to redirect to the overview.md file.
Thanks for the heads up! Done in #31063
What does this PR do?
As per the title, this PR refactors the quantization documentation to make it clearer, less overwhelming for users, and simpler to understand, mainly about which quantization method to use when - still WIP
cc @SunMarc @stevhliu @Titus-von-Koeller