Hey, the short answer is: there is no real use for this, don't use it.
The longer version is that this method loads the model's weights quantized to the given format, quantizing on the fly while loading, without the result ever being saved in that format. This was needed early on in stable-diffusion.cpp, when saving quantized models wasn't possible yet. But given how long it takes to load a model with this setting, I don't see any use case where it beats simply converting the model beforehand.
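For reference, here is a rough sketch of the two approaches using the `sd` example CLI. The flag names (`-M convert`, `--type`, `-m`, `-o`, `-p`) and the file paths are assumptions based on recent builds, so check `sd --help` for your version:

```sh
# On-the-fly quantization at load time: weights are re-quantized on every run (slow)
./sd -m v1-5-pruned-emaonly.safetensors --type q8_0 -p "a photo of a cat" -o output.png

# Preferred: convert once to a quantized GGUF, then load the pre-quantized file directly
./sd -M convert -m v1-5-pruned-emaonly.safetensors -o v1-5-pruned-emaonly-q8_0.gguf --type q8_0
./sd -m v1-5-pruned-emaonly-q8_0.gguf -p "a photo of a cat" -o output.png
```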
What is the purpose of this API? Do I need to use it when running a quantized GGUF model? Thanks