Dockerize cake #30

Open · wants to merge 6 commits into main

Conversation

@derogab commented Sep 16, 2024

News

  • Dockerize cake
  • Add docker image auto-build w/ GitHub Actions

New docker commands

Splitting the Model:

docker run --rm -v /path/to/data:/data ghcr.io/evilsocket/cake \
  cake-split-model --model-path /data/Meta-Llama-3-8B \   # source model to split
                   --topology /data/topology.yml \        # topology file
                   --output /data/output-folder-name      # output folder

Run a worker node:

docker run --rm --network host -v /path/to/data:/data ghcr.io/evilsocket/cake \
  cake-cli --model /data/Meta-Llama-3-8B \    # model path
           --mode worker \                    # run as worker
           --name worker0 \                   # worker name in topology file
           --topology /data/topology.yml \    # topology
           --address 0.0.0.0:10128            # bind address

Run a master node with an OpenAI compatible REST API:

docker run --rm --network host -v /path/to/data:/data ghcr.io/evilsocket/cake \
  cake-cli --model /data/Meta-Llama-3-8B \    # model path
           --api 0.0.0.0:8080 \               # API bind address
           --topology /data/topology.yml      # topology file

@evilsocket (Owner)

@derogab thank you, this is cool! However, in all the containers acceleration is disabled.

@derogab (Author) commented Sep 17, 2024

Yes, right 😕

I have never had hardware or applications worth enabling it for (😅), so until now I hadn't looked into it.
I always thought it was enough to add docker run --gpus all [...], but (after a brief search) I may be wrong.

I understand that the host hardware may need to be mapped into the container at runtime, but this is not standard. Do you think it would be enough to add the warning "in all the containers acceleration is disabled" to the README?

In the meantime, I will try to understand it better, especially whether changes to the Dockerfile are required.

@evilsocket (Owner)

Well, it's a bit more complex than that. Since the code is compiled within the container, no acceleration is detected at build time, so none is compiled in. I think the containers (at least the workers) require something like this: https://sarus.readthedocs.io/en/stable/user/custom-cuda-images.html

The main point of this project is using acceleration (inference is already slowed down by the fact that it is distributed; without acceleration it is just unusable), so it doesn't make much sense to have containers without it.
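A hedged sketch of what a CUDA-enabled image might look like, following the multi-stage pattern from the linked Sarus docs. The nvidia/cuda image tags are real, but the cargo invocation and the "cuda" feature name are assumptions about this repo, not something confirmed in this thread:

```dockerfile
# Hypothetical sketch only -- not the Dockerfile from this PR.
# Build stage: a CUDA "devel" base image, so the CUDA toolkit and
# headers are present while the code is compiled, letting the build
# detect acceleration and compile it in.
FROM nvidia/cuda:12.4.1-devel-ubuntu22.04 AS builder
RUN apt-get update && apt-get install -y curl build-essential pkg-config libssl-dev \
    && curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"
WORKDIR /app
COPY . .
# "cuda" as a cargo feature name is an assumption; check Cargo.toml.
RUN cargo build --release --features cuda

# Runtime stage: the lighter "runtime" image, carrying only the
# shared CUDA libraries the compiled binary needs.
FROM nvidia/cuda:12.4.1-runtime-ubuntu22.04
COPY --from=builder /app/target/release/cake-cli /usr/local/bin/cake-cli
ENTRYPOINT ["cake-cli"]
```

Even with a CUDA-enabled image, the host GPU still has to be exposed at run time, e.g. docker run --rm --gpus all ..., which requires the NVIDIA Container Toolkit installed on the host.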

@derogab (Author) commented Sep 17, 2024

Got it, thank you! I'll try to see what I can do.
