# llama
Here are 14 public repositories matching this topic...
Fast LLaMA inference on CPU using llama.cpp for Python
Updated Mar 23, 2023 · C
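The entry above refers to Python bindings for llama.cpp. As a minimal sketch of what CPU inference through such bindings typically looks like, here is an example in the style of the llama-cpp-python package's `Llama` class; the package, model path, and parameter values are assumptions for illustration, not taken from the listed repo:

```python
# Minimal CPU inference sketch in the style of llama-cpp-python
# (assumed binding; the listed repo may expose a different API).
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-7b.Q4_K_M.gguf",  # placeholder GGUF model file
    n_ctx=2048,    # context window size
    n_threads=8,   # CPU threads to use for inference
)

out = llm(
    "Q: What is the capital of France? A:",
    max_tokens=32,
    stop=["\n"],  # stop generation at the first newline
)
print(out["choices"][0]["text"])
```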
llama.cpp Desktop Client Demo
Updated Apr 22, 2023 · C
V-lang API wrapper for llm-inference chatllm.cpp
Topics: inference · bindings · api-wrapper · llama · gemma · mistral · int8 · int8-inference · v-lang · vlang · int8-quantization · cpu-inference · llm · llms · chatllm · ggml · llm-inference · qwen
Updated Nov 20, 2024 · C
Nim API wrapper for llm-inference chatllm.cpp
Topics: nim · inference · bindings · api-wrapper · llama · nim-language · gemma · nim-lang · mistral · int8 · int8-inference · int8-quantization · cpu-inference · llm · llms · chatllm · ggml · llm-inference · qwen
Updated Nov 20, 2024 · C
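Both wrappers above are tagged int8-quantization and cpu-inference. As a generic illustration of the technique those tags refer to (not chatllm.cpp's actual code), here is a minimal symmetric per-tensor int8 quantize/dequantize round trip in Python:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127]."""
    scale = max(np.abs(w).max(), 1e-8) / 127.0  # guard against all-zero tensors
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)
print("max abs error:", np.abs(w - w_hat).max())
```

Real int8 backends such as ggml refine this with per-block scales and integer matrix kernels; the sketch only shows the core float-to-int8 mapping.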
C++ Implementation of Meta's LLaMA v2 Engine. Credited to ggerganov/llama.cpp
Updated Oct 11, 2023 · C