No-messing-around sh client for llama.cpp's server
NOTE: Default config optimized for Llama-3.1
Dependencies:
- sh
- curl
- jq
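Both curl and jq are widely packaged; on Debian-based systems, for example (package names assumed standard):

sudo apt install curl jq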
Add these scripts to your $PATH. Configuration is done by editing the top of the llama.sh file.
Not sure how to set up llama.cpp? Check out Mozilla's llamafile!
Usage:
llama.sh [--options] "prompt"
echo "prompt" | llama.sh [--options]
echo "prompt" | llama.sh [--options] "system prompt"
Flags:
--n-predict N, -n N (number of tokens to generate, -1 for inf. default: -1)
--temp TEMP, -t TEMP (temperature. default: 0.6)
--min-p P, -m P (min-p. default: 0.05)
--top-p P, -p P (top-p. default: 0.95)
--top-k K, -k K (top-k. default: 45)
--stop "word", -s "word" (stop word. default: none)
--log logfile, -l logfile (set file for logging. default: ~/.cache/last_response.txt)
--verbose, -v (echo json payload before sending)
--raw, -r (do not wrap prompt with prefix/suffix strings)
--no-sys-prompt, -e (do not include system prompt string)
--api-key "key", -a "key" (override key used for llama.cpp API, usually not needed unless explicitly set. will override env var)
--api-url "url", -u "url" (override url used for llama.cpp API. will override env var)
--help, -h (display this message)
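A sketch combining several flags (the prompt, stop word, and log path are illustrative). The second line uses -r and -e together to send the input as a bare completion, with no template wrapping and no system prompt:

llama.sh -n 128 -t 0.8 -k 40 -s "###" -l /tmp/limerick.txt "Write a limerick about curl"
echo "Translate to French: good morning" | llama.sh -r -e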
Environment Variables:
LSH_SYSTEM_PROMPT_PREFIX (string prefixed to system prompt input)
LSH_PREFIX (string prefixed to user prompt input)
LSH_SUFFIX (string appended to user prompt input)
LSH_API_KEY (optional API key)
LSH_API_URL (API url)
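Since the defaults target Llama-3.1, overriding these variables is mainly useful for other models. A minimal sketch for a Mistral-Instruct-style template, assuming the script concatenates the strings verbatim around your input (the template strings and port are assumptions; check your model's chat template and server settings):

export LSH_PREFIX='[INST] '
export LSH_SUFFIX=' [/INST]'
export LSH_SYSTEM_PROMPT_PREFIX=''   # Mistral-Instruct has no separate system header
export LSH_API_URL='http://127.0.0.1:8080'   # llama.cpp server's default port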