Skip to content

Commit

Permalink
convert-no-torch -> convert-legacy-llama
Browse files Browse the repository at this point in the history
  • Loading branch information
Galunid committed May 21, 2024
1 parent 85e4e2b commit 88e06d0
Show file tree
Hide file tree
Showing 7 changed files with 26 additions and 26 deletions.
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -690,7 +690,7 @@ Building the program with BLAS support may lead to some performance improvements
To obtain the official LLaMA 2 weights please see the <a href="#obtaining-and-using-the-facebook-llama-2-model">Obtaining and using the Facebook LLaMA 2 model</a> section. There is also a large selection of pre-quantized `gguf` models available on Hugging Face.
Note: `convert.py` has been moved to `examples/convert-no-torch.py` and shouldn't be used for anything other than `Llama/Llama2/Mistral` models and their derievatives.
Note: `convert.py` has been moved to `examples/convert-legacy-llama.py` and shouldn't be used for anything other than `Llama/Llama2/Mistral` models and their derievatives.
It does not support LLaMA 3, you can use `convert-hf-to-gguf.py` with LLaMA 3 downloaded from Hugging Face.

```bash
Expand Down
4 changes: 2 additions & 2 deletions ci/run.sh
Original file line number Diff line number Diff line change
Expand Up @@ -282,7 +282,7 @@ function gg_run_open_llama_3b_v2 {
(time cmake -DCMAKE_BUILD_TYPE=Release ${CMAKE_EXTRA} -DLLAMA_QKK_64=1 .. ) 2>&1 | tee -a $OUT/${ci}-cmake.log
(time make -j ) 2>&1 | tee -a $OUT/${ci}-make.log

python3 ../examples/convert-no-torch.py ${path_models}
python3 ../examples/convert-legacy-llama.py ${path_models}

model_f16="${path_models}/ggml-model-f16.gguf"
model_q8_0="${path_models}/ggml-model-q8_0.gguf"
Expand Down Expand Up @@ -417,7 +417,7 @@ function gg_run_open_llama_7b_v2 {
(time cmake -DCMAKE_BUILD_TYPE=Release ${CMAKE_EXTRA} -DLLAMA_CUDA=1 .. ) 2>&1 | tee -a $OUT/${ci}-cmake.log
(time make -j ) 2>&1 | tee -a $OUT/${ci}-make.log

python3 ../examples/convert-no-torch.py ${path_models}
python3 ../examples/convert-legacy-llama.py ${path_models}

model_f16="${path_models}/ggml-model-f16.gguf"
model_q8_0="${path_models}/ggml-model-q8_0.gguf"
Expand Down
2 changes: 1 addition & 1 deletion docs/HOWTO-add-model.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ Also, it is important to check that the examples and main ggml backends (CUDA, M
### 1. Convert the model to GGUF

This step is done in python with a `convert` script using the [gguf](https://pypi.org/project/gguf/) library.
Depending on the model architecture, you can use either [convert-hf-to-gguf.py](../convert-hf-to-gguf.py) or [examples/convert-no-torch.py](../examples/convert-no-torch.py) (for `llama/llama2` models in `.pth` format).
Depending on the model architecture, you can use either [convert-hf-to-gguf.py](../convert-hf-to-gguf.py) or [examples/convert-legacy-llama.py](../examples/convert-legacy-llama.py) (for `llama/llama2` models in `.pth` format).

The convert script reads the model configuration, tokenizer, tensor names+data and converts them to GGUF metadata and tensors.

Expand Down
4 changes: 2 additions & 2 deletions examples/llava/MobileVLM-README.md
Original file line number Diff line number Diff line change
Expand Up @@ -54,10 +54,10 @@ python ./examples/llava/convert-image-encoder-to-gguf \
--projector-type ldpv2
```

4. Use `examples/convert-no-torch.py` to convert the LLaMA part of LLaVA to GGUF:
4. Use `examples/convert-legacy-llama.py` to convert the LLaMA part of LLaVA to GGUF:

```sh
python ./examples/convert-no-torch.py path/to/MobileVLM-1.7B
python ./examples/convert-legacy-llama.py path/to/MobileVLM-1.7B
```

5. Use `quantize` to convert LLaMA part's DataType from `fp16` to `q4_k`
Expand Down
6 changes: 3 additions & 3 deletions examples/llava/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,10 +50,10 @@ python ./examples/llava/llava-surgery.py -m ../llava-v1.5-7b
python ./examples/llava/convert-image-encoder-to-gguf.py -m ../clip-vit-large-patch14-336 --llava-projector ../llava-v1.5-7b/llava.projector --output-dir ../llava-v1.5-7b
```

5. Use `examples/convert-no-torch.py` to convert the LLaMA part of LLaVA to GGUF:
5. Use `examples/convert-legacy-llama.py` to convert the LLaMA part of LLaVA to GGUF:

```sh
python ./examples/convert-no-torch.py ../llava-v1.5-7b --skip-unknown
python ./examples/convert-legacy-llama.py ../llava-v1.5-7b --skip-unknown
```

Now both the LLaMA part and the image encoder are in the `llava-v1.5-7b` directory.
Expand Down Expand Up @@ -92,7 +92,7 @@ python ./examples/llava/convert-image-encoder-to-gguf.py -m vit --llava-projecto

6) Then convert the model to gguf format:
```console
python ./examples/convert-no-torch.py ../llava-v1.6-vicuna-7b/ --skip-unknown
python ./examples/convert-legacy-llama.py ../llava-v1.6-vicuna-7b/ --skip-unknown
```

7) And finally we can run the llava-cli using the 1.6 model version:
Expand Down
20 changes: 10 additions & 10 deletions scripts/convert-gg.sh
Original file line number Diff line number Diff line change
Expand Up @@ -3,20 +3,20 @@
set -e

# LLaMA v1
python3 examples/convert-no-torch.py ../llama1/7B --outfile models/llama-7b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-no-torch.py ../llama1/13B --outfile models/llama-13b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-no-torch.py ../llama1/30B --outfile models/llama-30b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-no-torch.py ../llama1/65B --outfile models/llama-65b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ../llama1/7B --outfile models/llama-7b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ../llama1/13B --outfile models/llama-13b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ../llama1/30B --outfile models/llama-30b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ../llama1/65B --outfile models/llama-65b/ggml-model-f16.gguf --outtype f16

# LLaMA v2
python3 examples/convert-no-torch.py ../llama2/llama-2-7b --outfile models/llama-7b-v2/ggml-model-f16.gguf --outtype f16
python3 examples/convert-no-torch.py ../llama2/llama-2-13b --outfile models/llama-13b-v2/ggml-model-f16.gguf --outtype f16
python3 examples/convert-no-torch.py ../llama2/llama-2-70b --outfile models/llama-70b-v2/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ../llama2/llama-2-7b --outfile models/llama-7b-v2/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ../llama2/llama-2-13b --outfile models/llama-13b-v2/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ../llama2/llama-2-70b --outfile models/llama-70b-v2/ggml-model-f16.gguf --outtype f16

# Code Llama
python3 examples/convert-no-torch.py ../codellama/CodeLlama-7b/ --outfile models/codellama-7b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-no-torch.py ../codellama/CodeLlama-13b/ --outfile models/codellama-13b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-no-torch.py ../codellama/CodeLlama-34b/ --outfile models/codellama-34b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ../codellama/CodeLlama-7b/ --outfile models/codellama-7b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ../codellama/CodeLlama-13b/ --outfile models/codellama-13b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ../codellama/CodeLlama-34b/ --outfile models/codellama-34b/ggml-model-f16.gguf --outtype f16

# Falcon
python3 convert-falcon-hf-to-gguf.py ../falcon/falcon-7b 1
Expand Down
14 changes: 7 additions & 7 deletions scripts/pod-llama.sh
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,7 @@ if [ "$1" -eq "1" ]; then

cd /workspace/llama.cpp

python3 examples/convert-no-torch.py ./models/tinyllama-1b --outfile ./models/tinyllama-1b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ./models/tinyllama-1b --outfile ./models/tinyllama-1b/ggml-model-f16.gguf --outtype f16

./quantize ./models/tinyllama-1b/ggml-model-f16.gguf ./models/tinyllama-1b/ggml-model-q4_0.gguf q4_0
./quantize ./models/tinyllama-1b/ggml-model-f16.gguf ./models/tinyllama-1b/ggml-model-q4_k.gguf q4_k
Expand All @@ -90,7 +90,7 @@ if [ "$1" -eq "2" ]; then

cd /workspace/llama.cpp

python3 examples/convert-no-torch.py ./models/codellama-7b --outfile ./models/codellama-7b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ./models/codellama-7b --outfile ./models/codellama-7b/ggml-model-f16.gguf --outtype f16

./quantize ./models/codellama-7b/ggml-model-f16.gguf ./models/codellama-7b/ggml-model-q4_0.gguf q4_0
./quantize ./models/codellama-7b/ggml-model-f16.gguf ./models/codellama-7b/ggml-model-q4_k.gguf q4_k
Expand All @@ -105,7 +105,7 @@ if [ "$1" -eq "3" ]; then

cd /workspace/llama.cpp

python3 examples/convert-no-torch.py ./models/codellama-13b --outfile ./models/codellama-13b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ./models/codellama-13b --outfile ./models/codellama-13b/ggml-model-f16.gguf --outtype f16

./quantize ./models/codellama-13b/ggml-model-f16.gguf ./models/codellama-13b/ggml-model-q4_0.gguf q4_0
./quantize ./models/codellama-13b/ggml-model-f16.gguf ./models/codellama-13b/ggml-model-q4_k.gguf q4_k
Expand All @@ -120,7 +120,7 @@ if [ "$1" -eq "4" ]; then

cd /workspace/llama.cpp

python3 examples/convert-no-torch.py ./models/codellama-34b --outfile ./models/codellama-34b/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ./models/codellama-34b --outfile ./models/codellama-34b/ggml-model-f16.gguf --outtype f16

./quantize ./models/codellama-34b/ggml-model-f16.gguf ./models/codellama-34b/ggml-model-q4_0.gguf q4_0
./quantize ./models/codellama-34b/ggml-model-f16.gguf ./models/codellama-34b/ggml-model-q4_k.gguf q4_k
Expand All @@ -135,7 +135,7 @@ if [ "$1" -eq "5" ]; then

cd /workspace/llama.cpp

python3 examples/convert-no-torch.py ./models/codellama-7b-instruct --outfile ./models/codellama-7b-instruct/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ./models/codellama-7b-instruct --outfile ./models/codellama-7b-instruct/ggml-model-f16.gguf --outtype f16

./quantize ./models/codellama-7b-instruct/ggml-model-f16.gguf ./models/codellama-7b-instruct/ggml-model-q4_0.gguf q4_0
./quantize ./models/codellama-7b-instruct/ggml-model-f16.gguf ./models/codellama-7b-instruct/ggml-model-q4_k.gguf q4_k
Expand All @@ -150,7 +150,7 @@ if [ "$1" -eq "6" ]; then

cd /workspace/llama.cpp

python3 examples/convert-no-torch.py ./models/codellama-13b-instruct --outfile ./models/codellama-13b-instruct/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ./models/codellama-13b-instruct --outfile ./models/codellama-13b-instruct/ggml-model-f16.gguf --outtype f16

./quantize ./models/codellama-13b-instruct/ggml-model-f16.gguf ./models/codellama-13b-instruct/ggml-model-q4_0.gguf q4_0
./quantize ./models/codellama-13b-instruct/ggml-model-f16.gguf ./models/codellama-13b-instruct/ggml-model-q4_k.gguf q4_k
Expand All @@ -165,7 +165,7 @@ if [ "$1" -eq "7" ]; then

cd /workspace/llama.cpp

python3 examples/convert-no-torch.py ./models/codellama-34b-instruct --outfile ./models/codellama-34b-instruct/ggml-model-f16.gguf --outtype f16
python3 examples/convert-legacy-llama.py ./models/codellama-34b-instruct --outfile ./models/codellama-34b-instruct/ggml-model-f16.gguf --outtype f16

./quantize ./models/codellama-34b-instruct/ggml-model-f16.gguf ./models/codellama-34b-instruct/ggml-model-q4_0.gguf q4_0
./quantize ./models/codellama-34b-instruct/ggml-model-f16.gguf ./models/codellama-34b-instruct/ggml-model-q4_k.gguf q4_k
Expand Down

0 comments on commit 88e06d0

Please sign in to comment.