[CLI] simplify docker run #159

Merged
merged 12 commits into main from simplify_docker_2 on Sep 30, 2024
Conversation

yanxi0830 (Contributor) commented Sep 30, 2024

Changes

  • Motivation: Users should not need to install the llama CLI and run llama stack configure / llama stack run outside of docker containers; downloading the docker image should be sufficient to start the Llama Stack server.
  • Clean up CLI output messages.
  • [RFC] New developer flow for interacting with the docker image.

Developer Flow

  1. Download the docker image from Docker Hub:
docker image pull llamastack/llamastack-local-gpu
  2. [New] Run with the built-in default config:
docker run -it -p 5000:5000 -v ~/.llama:/root/.llama --gpus=all llamastack/llamastack-local-gpu
  3. (Advanced Option) Run with a custom config:
docker run -it \
  -p 5000:5000 \
  -v path/to/run.yaml:/app/run.yaml \
  -v ~/.llama:/root/.llama \
  --gpus=all \
  llamastack-d1 \
  /app/run.yaml \
  --port 5000

where path/to/run.yaml is the absolute path to the config file outside the container, and /app/run.yaml is the path the config is mounted to inside the container. The trailing /app/run.yaml and --port 5000 arguments are passed through to the server.
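For orientation, a minimal run.yaml could look like the sketch below. This is illustrative only: the provider name and the inference values come from the build transcript later in this PR, but the top-level field names are assumptions rather than the authoritative schema.

# Illustrative run.yaml sketch; top-level field names are assumptions,
# not the authoritative schema.
apis_to_serve:            # hypothetical field listing the APIs to expose
  - inference
  - safety
  - agents
  - memory
  - telemetry
providers:                # hypothetical field mapping each API to a provider
  inference:
    provider_type: meta-reference
    config:
      model: Llama3.1-8B-Instruct   # defaults taken from the build transcript below
      max_seq_len: 4096
      max_batch_size: 1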

3.5 (Easier configuration) Add example build.yaml / run.yaml configs (a build.yaml sketch follows step 4 below).

  4. The old llama configure/run flow outside the docker container still works:
$ llama stack build
> Enter a name for your Llama Stack (e.g. my-local-stack): d1
...

$ llama stack configure llamastack-d1

$ llama stack run d1
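As a companion to item 3.5, here is what an example build.yaml might look like. The fields mirror the questions asked in the interactive build transcript below; the exact schema (in particular the distribution_spec grouping) is an assumption.

# Illustrative build.yaml sketch; fields mirror the interactive build
# prompts below, but the exact schema is an assumption.
name: d1
image_type: docker              # docker or conda, per the build prompt
distribution_spec:              # hypothetical grouping
  description: Example local Llama Stack distribution
  providers:
    inference: meta-reference
    safety: meta-reference
    agents: meta-reference
    memory: meta-reference
    telemetry: meta-reference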

Distribution Owner: Building the Docker Image

$ llama stack build

> Enter a name for your Llama Stack (e.g. my-local-stack): d7
> Enter the image type you want your Llama Stack to be built as (docker or conda): docker

 Llama Stack is composed of several APIs working together. Let's configure the providers (implementations) you want to use for these APIs.
> Enter provider for the inference API: (default=meta-reference): meta-reference
> Enter provider for the safety API: (default=meta-reference): meta-reference
> Enter provider for the agents API: (default=meta-reference): meta-reference
> Enter provider for the memory API: (default=meta-reference): meta-reference
> Enter provider for the telemetry API: (default=meta-reference): meta-reference
 
 > (Optional) Enter a short description for your Llama Stack:
Build spec configuration saved at /data/users/xiyan/llama-stack/tmp/configs/d7-build.yaml
Configuring API `inference`...
=== Configuring provider `meta-reference` for API inference...
Enter value for model (default: Llama3.1-8B-Instruct) (required): 
Do you want to configure quantization? (y/n): n
Enter value for torch_seed (optional): 
Enter value for max_seq_len (default: 4096) (required): 
Enter value for max_batch_size (default: 1) (required): 

Configuring API `safety`...
=== Configuring provider `meta-reference` for API safety...
Do you want to configure llama_guard_shield? (y/n): n
Do you want to configure prompt_guard_shield? (y/n): n

Configuring API `agents`...
=== Configuring provider `meta-reference` for API agents...
Enter `type` for persistence_store (options: redis, sqlite, postgres) (default: sqlite): 

Configuring SqliteKVStoreConfig:
Enter value for namespace (optional): 
Enter value for db_path (default: /home/xiyan/.llama/runtime/kvstore.db) (required): 

Configuring API `memory`...
=== Configuring provider `meta-reference` for API memory...
> Please enter the supported memory bank type your provider has for memory: vector

Configuring API `telemetry`...
=== Configuring provider `meta-reference` for API telemetry...

> YAML configuration has been written to `/data/users/xiyan/llama-stack/tmp/configs/d7-run.yaml`.
Dockerfile created successfully in /tmp/tmp.4Mfy6zpfb2/Dockerfile

FROM python:3.10-slim
WORKDIR /app
...

...
Success! You can run it with: podman run -p 8000:8000 llamastack-d7
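Once built, the llamastack-d7 image can be launched like the prebuilt image in step 3 above. A sketch reusing only flags already shown in this PR (the CLI's success message suggests podman and port 8000; this mirrors the docker flags and port from steps 2–3 instead):

docker run -it \
  -p 5000:5000 \
  -v ~/.llama:/root/.llama \
  --gpus=all \
  llamastack-d7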

@facebook-github-bot added the CLA Signed label on Sep 30, 2024
@yanxi0830 marked this pull request as ready for review on September 30, 2024 16:07
ashwinb (Contributor) left a comment:

lgtm

llama_stack/distribution/build.py (review thread resolved)
llama_stack/distribution/configure_container.sh (outdated; review thread resolved)
@yanxi0830 merged commit d28c3df into main on Sep 30, 2024
3 checks passed
@yanxi0830 deleted the simplify_docker_2 branch on September 30, 2024 22:30
Labels: CLA Signed