The goal of podman-llm is to make AI even more boring.
Install podman-llm by running this one-liner:
curl -fsSL https://raw.githubusercontent.com/ericcurtin/podman-llm/s/install.sh | sudo bash
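If you would rather review the script before it runs as root, you can download it first and then execute it:

curl -fsSL https://raw.githubusercontent.com/ericcurtin/podman-llm/s/install.sh -o install.sh
less install.sh
sudo bash install.sh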
You can run a model using the run command. This will start an interactive session where you can query the model.
$ podman-llm run granite
> Tell me about podman in less than ten words
A fast, secure, and private container engine for modern applications.
>
To serve a model via HTTP, use the serve command. This will start an HTTP server that listens for incoming requests to interact with the model.
$ podman-llm serve granite
...
{"tid":"140477699799168","timestamp":1719579518,"level":"INFO","function":"main","line":3793,"msg":"HTTP server listening","n_threads_http":"11","port":"8080","hostname":"127.0.0.1"}
...
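The log above comes from the llama.cpp HTTP server, so once it is listening you can query it directly. The example below is a sketch that assumes llama.cpp's default /completion endpoint is exposed on the host and port shown in the log:

curl -s http://127.0.0.1:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Tell me about podman in less than ten words", "n_predict": 32}'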
| Model     | Parameters | Run                      |
|-----------|------------|--------------------------|
| granite   | 3B         | podman-llm run granite   |
| mistral   | 7B         | podman-llm run mistral   |
| merlinite | 7B         | podman-llm run merlinite |
Here is an example Containerfile:
FROM quay.io/podman-llm/podman-llm:41
RUN llama-main --hf-repo ibm-granite/granite-3b-code-instruct-GGUF -m granite-3b-code-instruct.Q4_K_M.gguf
LABEL MODEL=/granite-3b-code-instruct.Q4_K_M.gguf
LABEL MODEL is important so podman-llm knows where to find the .gguf file inside the image.
We then build the image with:
podman-llm build granite
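Once built, the label can be read back to locate the model file; the command below is a sketch that assumes the resulting image ends up tagged granite locally.

podman image inspect granite --format '{{ index .Labels "MODEL" }}'

This should print /granite-3b-code-instruct.Q4_K_M.gguf, the path set by the LABEL above.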
+----------------+
|                |
| podman-llm run |
|                |
+-------+--------+
        |
        v
+----------------+    +-----------------------+    +------------------+
|                |    | Pull runtime layer    |    | Pull model layer |
| Auto-detect    +--->| for llama.cpp         +--->| i.e. granite     |
| hardware type  |    | (CPU, Vulkan, AMD,    |    |                  |
|                |    |  Nvidia, Intel,       |    +------------------+
+----------------+    |  Apple Silicon, etc.) |    | Repo options:    |
                      +-----------------------+    +-+------+-------+-+
                                                     |      |       |
                                                     v      v       v
                                              +---------+ +------+ +----------+
                                              | Hugging | | quay | | Ollama   |
                                              | Face    | |      | | Registry |
                                              +-------+-+ +---+--+ +-+--------+
                                                      |       |      |
                                                      v       v      v
                                                   +------------------+
                                                   | Start container  |
                                                   | with llama.cpp   |
                                                   | and granite      |
                                                   | model            |
                                                   +------------------+
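To make the auto-detect step above a little more concrete, here is a minimal bash sketch of how a hardware check could choose a runtime image. The probes and image tags are illustrative assumptions, not podman-llm's actual detection logic or tag names.

# Illustrative only: hypothetical image tags, not the real podman-llm code
select_runtime_image() {
  if command -v nvidia-smi > /dev/null 2>&1; then
    echo "quay.io/podman-llm/podman-llm-cuda:41"   # Nvidia GPU visible
  elif [ -e /dev/kfd ]; then
    echo "quay.io/podman-llm/podman-llm-rocm:41"   # AMD GPU (ROCm) visible
  else
    echo "quay.io/podman-llm/podman-llm:41"        # CPU fallback
  fi
}

podman pull "$(select_runtime_image)"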