diff --git a/docs/android.md b/docs/android.md index 320b62240382f..47530c6c1d478 100644 --- a/docs/android.md +++ b/docs/android.md @@ -23,10 +23,10 @@ $ curl -L {model-url} -o ~/{model}.gguf Then, if you are not already in the repo directory, `cd` into `llama.cpp` and: ``` -$ ./build/bin/llama-simple -m ~/{model}.gguf -c {context-size} -p "{your-prompt}" +$ ./build/bin/llama-cli -m ~/{model}.gguf -c {context-size} -p "{your-prompt}" ``` -Here, we show `llama-simple`, but any of the executables under `examples` should work, in theory. Be sure to set `context-size` to a reasonable number (say, 4096) to start with; otherwise, memory could spike and kill your terminal. +Here, we show `llama-cli`, but any of the executables under `examples` should work, in theory. Be sure to set `context-size` to a reasonable number (say, 4096) to start with; otherwise, memory could spike and kill your terminal. To see what it might look like visually, here's an old demo of an interactive session running on a Pixel 5 phone: diff --git a/examples/simple/README.md b/examples/simple/README.md index 0ff3425359a41..937008b243ee4 100644 --- a/examples/simple/README.md +++ b/examples/simple/README.md @@ -3,7 +3,7 @@ The purpose of this example is to demonstrate a minimal usage of llama.cpp for generating text with a given prompt. ```bash -./llama-simple -m ./models/llama-7b-v2/ggml-model-f16.gguf -p "Hello my name is" +./llama-simple -m ./models/llama-7b-v2/ggml-model-f16.gguf "Hello my name is" ...