Purpose of WithQuantization() #31

Open
budgetdevv opened this issue Sep 15, 2024 · 2 comments

@budgetdevv

What is the purpose of this API? Do I need to use it when running a quantized GGUF model? Thanks

@DarthAffe
Owner

Hey, the short answer is: there is no real use for this, so don't use it.

The longer version: this method loads the model's weights quantized to the given format. It quantizes on the fly while loading, without the model being saved in that format. This was required in the early days of stable-diffusion.cpp, when saving quantized models wasn't possible. But because loading a model with this setting takes so long, I don't really see any use case where it is better than just converting the model beforehand.
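
To illustrate the trade-off described above, here is a minimal Python sketch. It is a toy model, not code from stable-diffusion.cpp or this library: `quantize_q8_0`, `dequantize`, and the call counter are hypothetical names, and the block layout (32 values sharing one scale) is only loosely modeled on ggml's Q8_0 format. It shows that quantizing at load time repeats the same work on every start, while converting once lets later loads reuse the result.

```python
BLOCK = 32  # values per block; assumption loosely based on ggml's Q8_0 layout

quantize_calls = 0  # counts how often the expensive step runs


def quantize_q8_0(weights):
    """Toy 8-bit block quantization: one float scale plus int8-range values per block."""
    global quantize_calls
    quantize_calls += 1
    blocks = []
    for i in range(0, len(weights), BLOCK):
        chunk = weights[i:i + BLOCK]
        amax = max(abs(w) for w in chunk)
        scale = amax / 127.0 if amax else 1.0
        blocks.append((scale, [round(w / scale) for w in chunk]))
    return blocks


def dequantize(blocks):
    """Recover approximate float weights from the quantized blocks."""
    return [scale * q for scale, qs in blocks for q in qs]


# Stand-in for full-precision model weights in the range [-1, 1).
weights = [((i * 37) % 200 - 100) / 100.0 for i in range(128)]

# Option A: quantize on the fly, so every program start pays the cost again.
for _ in range(3):
    model = quantize_q8_0(weights)
calls_on_the_fly = quantize_calls

# Option B: convert once (the result would be saved to disk), then reuse it.
quantize_calls = 0
saved = quantize_q8_0(weights)
for _ in range(3):
    model = saved  # stands in for reading the already-quantized file
calls_preconverted = quantize_calls

max_error = max(abs(a - b) for a, b in zip(weights, dequantize(saved)))
print(calls_on_the_fly, calls_preconverted, max_error)
```

Both options end up with the same quantized weights; the only difference in this sketch is how often the quantization step runs.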

@budgetdevv
Author

Hey, thanks for the prompt response! I was wondering why it took so long.
