Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Remove model weights from container images #786

Merged
merged 18 commits into from
Aug 1, 2024

Conversation

YrrepNoj
Copy link
Member

@YrrepNoj YrrepNoj commented Jul 16, 2024

This PR is a proof of concept to how we can utilize Zarf Data Injections to push model weights to a PVC instead of directly building them into the Docker images themselves.

Pros:

  • Much smaller image size (but similar Zarf Package sizes; with the exception of whisper which is now smaller as well)
Model Container Original Size New Size
llama-cpp-python 4.58 GB 0.21 GB
whisper 6 GB 5.86 GB
text-embeddings 10.4 GB 5.63 GB
vLLM 14.2 GB 9.62
  • Faster build & deploy time
    • I did not quantify this bullet by testing the timings locally but I noticed significant speedups as I was performing other tests locally.
    • As a data-point, the e2e tests from this branch are running the model creation & deployment steps an average of 2 minutes faster than the tests branching from main. This is about a 30% speedup for this workflow step.



Cons:

  • Adds a tiny bit of complexity to our deployment by introducing a new PVC. This seems pretty reasonable, but it is still worth noting that this is an added thing that delivery engineers would have to manage (for the storage class of the PVC).
  • Adds a dependency on Python (and the dev optional-dependencies listed in the pyproject.toml being installed on the host machine during the zarf package create lifecycle.
    • This dependency is now introduced because we are now pulling down the model weights during the Zarf package creation time instead of pulling the model weights during the Docker container build time.

@YrrepNoj YrrepNoj requested a review from a team as a code owner July 16, 2024 21:21
@YrrepNoj YrrepNoj marked this pull request as draft July 26, 2024 15:00
@YrrepNoj YrrepNoj added this to the EVERGREEN milestone Jul 26, 2024
@YrrepNoj YrrepNoj changed the base branch from 623-ADR-model-directory to main July 26, 2024 15:20
@YrrepNoj YrrepNoj dismissed CollectiveUnicorn’s stale review July 26, 2024 15:20

The base branch was changed.

Copy link

netlify bot commented Jul 26, 2024

Deploy Preview for leapfrogai-docs ready!

Name Link
🔨 Latest commit 742cf4c
🔍 Latest deploy log https://app.netlify.com/sites/leapfrogai-docs/deploys/66aab5d55d38f80008e2bccd
😎 Deploy Preview https://deploy-preview-786--leapfrogai-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
Lighthouse
Lighthouse
1 paths audited
Performance: 39 (🟢 up 2 from production)
Accessibility: 98 (no change from production)
Best Practices: 100 (no change from production)
SEO: 92 (no change from production)
PWA: -
View the detailed breakdown and full score reports

To edit notification comments on pull requests, go to your Netlify site configuration.

@YrrepNoj YrrepNoj modified the milestone: EVERGREEN Jul 26, 2024
@YrrepNoj YrrepNoj marked this pull request as ready for review July 26, 2024 15:26
@YrrepNoj YrrepNoj force-pushed the breaking-models-out-POC branch 3 times, most recently from 148a648 to f67a509 Compare July 30, 2024 18:15
@YrrepNoj YrrepNoj changed the title chore: Proof of concept: Remove model weights from container images chore: Remove model weights from container images Jul 30, 2024
@YrrepNoj YrrepNoj removed this from the EVERGREEN milestone Jul 30, 2024
@justinthelaw
Copy link
Contributor

justinthelaw commented Jul 31, 2024

The Dockerfiles still contain a lot of extra ENVs and ARGs. I recommend removal and an update of the READMEs to add instructions for switching models and also any other mechanics that need to be explained to potential delivery engineers or users that use these.

@YrrepNoj YrrepNoj merged commit 33e4efb into main Aug 1, 2024
30 of 36 checks passed
@YrrepNoj YrrepNoj deleted the breaking-models-out-POC branch August 1, 2024 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants