Commit d8b5711: Merge branch 'main' into serve-api-docs-pr

josiahbryan authored Oct 9, 2024
2 parents f669333 + 184e414
Showing 11 changed files with 58 additions and 22 deletions.
6 changes: 3 additions & 3 deletions container-images/cuda/Containerfile

@@ -22,8 +22,8 @@ RUN /usr/bin/python3 --version
 RUN pip install "huggingface_hub[cli]==${HUGGINGFACE_HUB_VERSION}"
 RUN pip install "omlmd==${OMLMD_VERSION}"
 
-# Build wouldnt complete couldnt find libcuda so made a systemlink
-# But this didnt work
+# Build wouldn't complete, couldn't find libcuda, so made a symlink
+# But this didn't work
 
 # RUN ln -s /usr/local/cuda/lib64/stubs/libcuda.so /usr/local/cuda/lib64/stubs/libcuda.so.1
 # RUN LD_LIBRARY_PATH=/usr/local/cuda/lib64/stubs/
@@ -32,7 +32,7 @@ RUN pip install "omlmd==${OMLMD_VERSION}"
 
 # ENV GGML_CCACHE=0
 
-# Build wouldnt complete with cmake even with nvidia container toolkit installed
+# Build wouldn't complete with cmake even with nvidia container toolkit installed
 
 RUN git clone https://github.com/ggerganov/llama.cpp && \
     cd llama.cpp && \
2 changes: 2 additions & 0 deletions docs/ramalama-containers.1.md

@@ -11,6 +11,8 @@ ramalama\-containers - list all RamaLama containers
 ## DESCRIPTION
 List all containers running AI Models
 
+Command will not work when run with the --nocontainer option.
+
 ## OPTIONS
 
 #### **--format**=*format*
4 changes: 4 additions & 0 deletions docs/ramalama-stop.1.md

@@ -6,6 +6,10 @@ ramalama\-stop - stop named container that is running AI Model
 ## SYNOPSIS
 **ramalama stop** [*options*] *name*
 
+Tells container engine to stop the specified container.
+
+Command will not work when run with the --nocontainer option.
+
 ## OPTIONS
 
 #### **--all**, **-a**
12 changes: 8 additions & 4 deletions docs/ramalama.1.md

@@ -10,16 +10,20 @@ ramalama - Simple management tool for working with AI Models
 RamaLama : The goal of RamaLama is to make AI boring.
 
 On first run RamaLama inspects your system for GPU support, falling back to CPU
-support if no GPUs are present. It then uses container engines like Podman or
-Docker to pull the appropriate OCI image with all of the software necessary to run an
-AI Model for your systems setup. This eliminates the need for the user to
-configure the system for AI themselves. After the initialization, RamaLama
+support if no GPUs are present. RamaLama uses container engines like Podman or
+Docker to pull the appropriate OCI image with all of the software necessary to
+run an AI Model for your system's setup. This eliminates the need for the user
+to configure the system for AI themselves. After the initialization, RamaLama
 will run the AI Models within a container based on the OCI image.
 
 RamaLama first pulls AI Models from model registries. It then start a chatbot
 or a service as a rest API (using llama.cpp's server) from a simple single command.
 Models are treated similarly to the way that Podman or Docker treat container images.
 
+If you have both Podman and Docker installed, RamaLama defaults to Podman; use
+the `RAMALAMA_CONTAINER_ENGINE=docker` environment variable to override this
+behaviour.
+
 RamaLama supports multiple AI model registries types called transports. Supported transports:
3 changes: 3 additions & 0 deletions pyproject.toml

@@ -3,6 +3,9 @@ name = "ramalama"
 version = "0.0.14"
 dependencies = [
     "argcomplete",
+    "tqdm",
+    "omlmd",
+    "huggingface_hub[cli]",
 ]
 requires-python = ">= 3.8"
 maintainers = [
26 changes: 14 additions & 12 deletions ramalama/cli.py

@@ -6,11 +6,12 @@
 import os
 import random
 import string
+import subprocess
 import sys
 import time
 
 from ramalama.huggingface import Huggingface
-from ramalama.common import in_container, container_manager, exec_cmd, run_cmd, default_image, find_working_directory
+from ramalama.common import in_container, container_manager, exec_cmd, run_cmd, default_image, find_working_directory, perror
 from ramalama.oci import OCI
 from ramalama.ollama import Ollama
 from ramalama.shortnames import Shortnames
@@ -217,11 +218,14 @@ def _list_containers(args):
     if args.format:
         conman_args += [f"--format={args.format}"]
 
-    output = run_cmd(conman_args).stdout.decode("utf-8").strip()
-    if output == "":
-        return []
-    return output.split("\n")
-
+    try:
+        output = run_cmd(conman_args).stdout.decode("utf-8").strip()
+        if output == "":
+            return []
+        return output.split("\n")
+    except subprocess.CalledProcessError as e:
+        perror("ramalama list command requires a running container engine")
+        raise e
 
 def list_containers(args):
     if len(_list_containers(args)) == 0:
@@ -382,11 +386,6 @@ def serve_cli(args):
     model.serve(args)
 
 
-def stop_cli(args):
-    model = New(args.MODEL)
-    model.stop(args)
-
-
 def stop_parser(subparsers):
     parser = subparsers.add_parser("stop", help="stop named container that is running AI Model")
     parser.add_argument("--container", default=False, action="store_false", help=argparse.SUPPRESS)
@@ -405,7 +404,10 @@ def _stop_container(args, name):
     if conman == "":
         raise IndexError("no container manager (Podman, Docker) found")
 
-    conman_args = [conman, "stop", "-t=0", "--ignore=" + str(args.ignore), name]
+    conman_args = [conman, "stop", "-t=0"]
+    if args.ignore:
+        conman_args += ["--ignore=" + str(args.ignore)]
+    conman_args += [name]
     run_cmd(conman_args)
 
 
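The new error path in _list_containers assumes run_cmd turns a non-zero exit status into subprocess.CalledProcessError. A minimal sketch of that contract, assuming run_cmd wraps subprocess.run with check=True (the real helper lives in ramalama/common.py and may differ):

    import subprocess

    def run_cmd(args):
        # check=True raises CalledProcessError on a non-zero exit status,
        # e.g. when the container engine's service is not reachable
        return subprocess.run(args, stdout=subprocess.PIPE, check=True)

    try:
        run_cmd(["podman", "ps"])
    except subprocess.CalledProcessError as e:
        print(f"engine not available (exit status {e.returncode})")

Note that a missing binary raises FileNotFoundError instead, which this except clause would not catch.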
4 changes: 4 additions & 0 deletions ramalama/common.py

@@ -17,6 +17,10 @@ def in_container():
 
 
 def container_manager():
+    engine = os.getenv("RAMALAMA_CONTAINER_ENGINE")
+    if engine:
+        return engine
+
     if available("podman"):
         return "podman"
 
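For illustration, a sketch of how the new override interacts with the existing probing; the available() helper is an assumed stand-in for RamaLama's PATH check, and the docker fallback is assumed from the surrounding docs:

    import os
    import shutil

    def available(cmd):
        # assumed stand-in: is the binary somewhere on PATH?
        return shutil.which(cmd) is not None

    def container_manager():
        # the environment variable wins, letting users force an engine
        engine = os.getenv("RAMALAMA_CONTAINER_ENGINE")
        if engine:
            return engine
        if available("podman"):
            return "podman"
        if available("docker"):
            return "docker"
        return ""

    os.environ["RAMALAMA_CONTAINER_ENGINE"] = "docker"
    print(container_manager())  # "docker", even when podman is installed

This is the behavior behind the RAMALAMA_CONTAINER_ENGINE note added to docs/ramalama.1.md above.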
4 changes: 2 additions & 2 deletions ramalama/huggingface.py

@@ -3,10 +3,10 @@
 from ramalama.model import Model
 
 missing_huggingface = """
-Huggingface models requires the huggingface-cli and tldm modules.
+Huggingface models require the huggingface-cli and tqdm modules.
 These modules can be installed via PyPi tools like pip, pip3, pipx or via
 distribution package managers like dnf or apt. Example:
-pip install huggingface_hub[cli] tldm
+pip install huggingface_hub[cli] tqdm
 """
6 changes: 6 additions & 0 deletions ramalama/oci.py

@@ -17,6 +17,12 @@ def __init__(self, model):
             self.omlmd = f"{i}/../../../bin/omlmd"
             if os.path.exists(self.omlmd):
                 break
+            raise NotImplementedError("""\
+OCI models require the omlmd module.
+This module can be installed via PyPi tools like pip, pip3, pipx or via
+distribution package managers like dnf or apt. Example:
+pip install omlmd
+""")
 
     def login(self, args):
         conman_args = [self.conman, "login"]
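Wherever the raise lands relative to the probe loop, the pattern is easy to get wrong: inside the loop it aborts on the first miss, and directly after the loop the break cannot skip it. A hedged sketch of the presumably intended behavior, where the error fires only after every candidate misses (the sys.path-based candidate list is an assumption inferred from the {i}/../../../bin/omlmd pattern above):

    import os
    import sys

    def find_omlmd():
        # probe each candidate path; give up only after all of them miss
        for i in sys.path:
            candidate = os.path.normpath(f"{i}/../../../bin/omlmd")
            if os.path.exists(candidate):
                return candidate
        raise NotImplementedError(
            "OCI models require the omlmd module; try: pip install omlmd"
        )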
12 changes: 11 additions & 1 deletion ramalama/ollama.py

@@ -8,7 +8,17 @@
 
 
 def download_file(url, dest_path, headers=None):
-    from tqdm import tqdm
+    try:
+        from tqdm import tqdm
+    except ImportError:
+        raise NotImplementedError(
+            """\
+Ollama models require the tqdm module.
+This module can be installed via PyPi tools like pip, pip3, pipx or via
+distribution package managers like dnf or apt. Example:
+pip install tqdm
+"""
+        )
 
     request = urllib.request.Request(url, headers=headers or {})
     with urllib.request.urlopen(request) as response:
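The body of download_file is truncated in this view; for illustration, a minimal sketch of a chunked download loop that puts tqdm to use (the chunk size and progress-bar options are assumptions, not necessarily RamaLama's implementation):

    import urllib.request
    from tqdm import tqdm

    def download_file(url, dest_path, headers=None):
        request = urllib.request.Request(url, headers=headers or {})
        with urllib.request.urlopen(request) as response:
            # Content-Length may be absent; tqdm accepts total=None
            total = int(response.headers.get("Content-Length") or 0) or None
            with open(dest_path, "wb") as out, tqdm(
                total=total, unit="B", unit_scale=True, desc=dest_path
            ) as bar:
                while chunk := response.read(8192):
                    out.write(chunk)
                    bar.update(len(chunk))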
1 change: 1 addition & 0 deletions requirements.txt

@@ -3,4 +3,5 @@ argcomplete
 setuptools
 wheel
 omlmd
+tqdm
 huggingface_hub[cli]