Dockerize cake #30

Open · wants to merge 6 commits into main

Conversation

@derogab commented Sep 16, 2024

News

  • Dockerize cake
  • Add docker image auto-build w/ GitHub Actions

New docker commands

Splitting the Model:

docker run --rm -v /path/to/data:/data ghcr.io/evilsocket/cake \
  cake-split-model --model-path /data/Meta-Llama-3-8B \   # source model to split
                   --topology /data/topology.yml \        # topology file
                   --output /data/output-folder-name      # output folder

Run a worker node:

docker run --rm --network host -v /path/to/data:/data ghcr.io/evilsocket/cake \
  cake-cli --model /data/Meta-Llama-3-8B \    # model path
           --mode worker \                    # run as worker
           --name worker0 \                   # worker name in topology file
           --topology /data/topology.yml \    # topology
           --address 0.0.0.0:10128            # bind address

Run a master node with an OpenAI compatible REST API:

docker run --rm --network host -v /path/to/data:/data ghcr.io/evilsocket/cake \
  cake-cli --model /data/Meta-Llama-3-8B \    # model path
           --api 0.0.0.0:8080 \               # API bind address
           --topology /data/topology.yml      # topology file

@evilsocket (Owner)

@derogab thank you, this is cool! However, in all the containers acceleration is disabled.

@derogab (Author) commented Sep 17, 2024

Yes, right 😕

I have never had hardware or applications worth enabling it for (😅), so until now I hadn't looked into it.
I always thought it was enough to add docker run --gpus all [...], but (after a brief search) I may be wrong.

I understand that the host hardware may need to be mapped into the container at runtime, but this is not standard. Do you think it would be enough to add the warning "in all the containers acceleration is disabled" to the README?

In the meantime, I will try to understand it better, especially whether changes to the Dockerfile are required.

@evilsocket (Owner)

Well, it's a bit more complex than that. Since the code is compiled within the container, no acceleration is detected at build time, so none is compiled in. I think the containers (at least the workers) require something like this: https://sarus.readthedocs.io/en/stable/user/custom-cuda-images.html

The main point of this project is using acceleration (inference is already slowed down by the fact that it is distributed; without acceleration it is just unusable), so it doesn't make much sense to have containers without it.
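A hedged sketch of what a CUDA-enabled image might look like, following the multi-stage pattern from the linked Sarus docs. The nvidia/cuda image tags are real, but the cargo invocation and the "cuda" feature name are assumptions about this repo, not something confirmed in this thread:

```dockerfile
# Hypothetical sketch only -- not the Dockerfile from this PR.
# Build stage: a CUDA "devel" base image, so the CUDA toolkit and
# headers are present while the code is compiled, letting the build
# detect acceleration and compile it in.
FROM nvidia/cuda:12.4.1-devel-ubuntu22.04 AS builder
RUN apt-get update && apt-get install -y curl build-essential pkg-config libssl-dev \
    && curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"
WORKDIR /app
COPY . .
# "cuda" as a cargo feature name is an assumption; check Cargo.toml.
RUN cargo build --release --features cuda

# Runtime stage: the lighter "runtime" image, carrying only the
# shared CUDA libraries the compiled binary needs.
FROM nvidia/cuda:12.4.1-runtime-ubuntu22.04
COPY --from=builder /app/target/release/cake-cli /usr/local/bin/cake-cli
ENTRYPOINT ["cake-cli"]
```

Even with a CUDA-enabled image, the host GPU still has to be exposed at run time, e.g. docker run --rm --gpus all ..., which requires the NVIDIA Container Toolkit installed on the host.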

@derogab (Author) commented Sep 17, 2024

Got it, thank you! I'll try to see what I can do.
