I've been merging LoRAs into quantized models for a while now with export_lora and have had good results. The models merge cleanly and performance appears to improve. Converting a LoRA to GGUF and then applying it to a model also results in a working model.
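For context, the merge itself is just the standard LoRA weight update, W' = W + (alpha / r) * (B @ A). Here's a minimal NumPy sketch of that math (not llama.cpp's actual export_lora code, and the shapes are toy values I picked for illustration):

```python
import numpy as np

def merge_lora(W: np.ndarray, A: np.ndarray, B: np.ndarray,
               alpha: float, r: int) -> np.ndarray:
    """Apply a LoRA update to a (dequantized) base weight matrix.

    A has shape (r, in_features), B has shape (out_features, r),
    so B @ A is a low-rank update with the same shape as W.
    """
    scale = alpha / r
    return W + scale * (B @ A)

# Toy shapes: a 64x64 weight with a rank-8 adapter.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)
A = rng.standard_normal((8, 64)).astype(np.float32)
B = np.zeros((64, 8), dtype=np.float32)  # B is zero-initialized in training
W_merged = merge_lora(W, A, B, alpha=16.0, r=8)
```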
The same can't be said for Falcon. All Falcon tunes are released as PEFT adapters, and the base model is simply too large to download in FP16; it runs to several hundred GB unless quantized.
I applied PR #3333 and can successfully convert the LoRA to GGUF, then use export_lora to merge. However, the resulting models emit repeating gibberish and throw SentencePiece errors when used with HF sampling.
Looking over the code, I can't find anything llama-specific in it. Has anyone been able to load a LoRA onto any Falcon model, either live or as a merge? Any ideas about what's wrong?
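One sanity check I can think of: compare the tensor names in the converted LoRA against the base Falcon GGUF, since a naming mismatch (Falcon fuses QKV into a single tensor, unlike llama) would let the merge silently misapply. A rough sketch using the gguf Python package that ships with llama.cpp; the file names and the .loraA/.loraB suffix convention here are assumptions on my part, so adjust to whatever the converted file actually contains:

```python
from gguf import GGUFReader

def tensor_names(path: str) -> set[str]:
    """Collect all tensor names from a GGUF file."""
    return {t.name for t in GGUFReader(path).tensors}

base = tensor_names("falcon-base.gguf")  # placeholder paths
lora = tensor_names("falcon-lora.gguf")

# Assuming adapter tensors carry a .loraA / .loraB suffix: strip it and
# verify each target actually exists in the base model.
targets = {n.rsplit(".lora", 1)[0] for n in lora}
missing = targets - base
print("adapter targets missing from base:", sorted(missing) or "none")
```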