Convert pytorch-based models to work with llama.cpp? #707
Comments
I think you can use the same approach as what I posted in #708.
There's an old convert.py script that does this. Combining it with the more recent conversion scripts to the new GGML formats makes it work fine. I don't know why the older convert.py script was nixed.
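As a side note for readers unsure which converter applies: it helps to first check which checkpoint layout you have (an HF-style `pytorch_model*.bin` state dict vs. Meta's `consolidated.*.pth` shards). A minimal sketch, assuming PyTorch is installed; the path is a placeholder:

```python
# inspect_checkpoint.py - list tensor names/shapes in a PyTorch checkpoint,
# to tell an HF-style state dict from Meta's consolidated.*.pth layout.
# The path below is a placeholder; point it at your own file.
import torch

ckpt = torch.load("models/hf_ckpt/pytorch_model.bin", map_location="cpu")

# Some checkpoints nest the weights under a "state_dict" key.
state_dict = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt

for name, value in state_dict.items():
    if hasattr(value, "shape"):
        print(f"{name}: {tuple(value.shape)} {value.dtype}")
```

HF-style checkpoints have names like `model.layers.0.self_attn.q_proj.weight`, while Meta's shards use names like `layers.0.attention.wq.weight`, which is a quick way to tell them apart.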
@IngwiePhoenix this should work for you. Ping me if you have trouble. After using this you'll need to migrate to the new ggml format. For some reason, the existing pth->ggml converter only accepts the base consolidated.00.pth format, whereas this one accepts .pt, .pth, and HF PyTorch formatted models. I'm not sure what normally generates the params.json file, but I included one as an example (for LLaMA 13B) in the gist; I think the only thing that changes is the … I can't find where this file originally came from, but ty to the person that made it :)
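For anyone reconstructing that file by hand, a minimal sketch of writing a params.json is below. The values are the hyperparameters commonly circulated for LLaMA-1 13B; treat them as an assumption and double-check against the example in the gist before relying on them.

```python
# write_params.py - emit an example params.json for LLaMA-1 13B.
# Values are the commonly circulated ones for 13B; verify against
# your own model before use.
import json

params = {
    "dim": 5120,         # hidden size
    "multiple_of": 256,  # FFN size rounding
    "n_heads": 40,       # attention heads
    "n_layers": 40,      # transformer layers
    "norm_eps": 1e-06,   # RMSNorm epsilon
    "vocab_size": -1,    # -1 typically means "read from the tokenizer"
}

with open("models/13B/params.json", "w") as f:
    json.dump(params, f, indent=2)
```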
Hi, thanks for your code. I have the following problem when I try to use it; I'm not sure if I'm doing this correctly. I first use this code (link) to convert my pth to bin files, and then I run `python3 convert.py --vocab-dir models/tokenizer.model --outfile models/lora-alpaca --model models/hf_ckpt`
Getting fiber internet installed and setting up still. Will run this code later, but looks promising! The parameters you were talking about are - as far as my novice understanding goes - part of the model itself. Like... part of the … Thanks for the links, too!
@MillionthOdin16 FYI, I think the convert.py comes from here: #545 (I think the plan is still to include it in main).
Okay great! Yeah, I still use that convert script a ton. Then I just convert it to the new ggml format.
I would like to request that all the various conversion scripts, formats, and files needed to convert models be documented in a single location, so that we can refer people to it. Ideally the document would also include a flow chart people can follow to get from whatever variant they're starting with to a format that can be loaded with llama.cpp. If this document does not exist, I would be happy to help write it.

I don't know the technical specifics of llama.cpp, PyTorch, or even ML, but I am a programmer by profession who has worked with a lot of low-level binary formats and protocols, and I have read a lot of RFCs and proprietary specifications to do so. I just need guidance to understand how all the various formats and scripts fit together.

Is writing such a document doable? Is this something the developers of this project would find useful and be interested in supporting? Would it make more sense for me to open this as a new issue? At the very least, such a document should severely cut down the number of issues posted about this topic.
Even better, combine all the scripts into one; the PR is in its final stages: #545
Well, I have tried every suggestion in this issue and nothing is working for the Salesforce CodeGen-16B-multi model on Hugging Face. It is downloaded as a 32 GB pytorch_model.bin file and none of the convert scripts work. (A table of each command tried and the resulting error followed.)
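A quick way to see why the LLaMA-specific converters reject CodeGen is to look at the model's config.json: CodeGen is not a LLaMA-family architecture, so converters that expect LLaMA tensor layouts won't recognize it. A small sketch, assuming a local Hugging Face download (the path is a placeholder):

```python
# check_arch.py - print which model family a Hugging Face checkpoint is,
# to tell whether a LLaMA-specific converter can handle it.
import json

# Hypothetical local path to the downloaded model directory.
with open("models/codegen-16B-multi/config.json") as f:
    config = json.load(f)

print("model_type:   ", config.get("model_type"))
print("architectures:", config.get("architectures"))
# llama.cpp's converters expect a LLaMA-family model; CodeGen reports a
# different model_type/architecture, which is why those scripts fail.
```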
This issue was closed because it has been inactive for 14 days since being marked as stale.
Out of curiosity, I want to see if I can launch a very mini AI on my little network server. It usually has around 3 GB of free memory, and it'd be nice to chat with it sometimes. For that, I'd like to try a smaller model like Pythia.

So I would like to know: how do I convert `pytorch_model*.bin` to ggml? I looked at the existing `convert_*.py` scripts, but none of those seemed to be for this type of model.

Thanks in advance!
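As an aside on the 3 GB budget, a rough weights-only size estimate helps pick a Pythia size before bothering with conversion. A back-of-the-envelope sketch (the ~4.5 bits/weight figure for 4-bit ggml quantization is an approximation, and real memory use adds KV cache and runtime overhead on top):

```python
# fit_check.py - rough estimate of whether a model's weights fit in RAM
# after quantization. Weights only; no KV cache or runtime overhead.
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    return n_params * bits_per_weight / 8 / 1e9

for name, n_params in [("pythia-410m", 0.41e9),
                       ("pythia-1.4b", 1.4e9),
                       ("pythia-2.8b", 2.8e9)]:
    fp16 = model_size_gb(n_params, 16)
    q4 = model_size_gb(n_params, 4.5)  # ~4.5 bits/weight for 4-bit quants
    print(f"{name}: fp16 ~{fp16:.1f} GB, 4-bit ~{q4:.1f} GB")
```

On those numbers, a 1.4B model at 4-bit is well under 1 GB of weights, so something in that range is a plausible fit for a 3 GB server.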