ADR: Breaking models weights out of model images #752

YrrepNoj · 2024-07-10T16:04:35Z

Related to #623

netlify · 2024-07-10T16:04:52Z

✅ Deploy Preview for leapfrogai-docs canceled.

Name	Link
🔨 Latest commit	`dc555c6`
🔍 Latest deploy log	https://app.netlify.com/sites/leapfrogai-docs/deploys/668eb115515668000880c04a

barronstone

Excited to see this approach implemented!

barronstone · 2024-07-12T14:35:16Z

adr/0006-model-registry.md

+Proposed
+
+## Context
+GenerativeAI models are big. Because LeapfrogAI is designed to be deployable into AirGapped environments, we need to ensure that we are bringing the big GenerativeAI models with us. Currently, we are brining the AI models with us by backing them into our container images. For example, [we download synthia into our llama-cpp-python image](https://github.com/defenseunicorns/leapfrogai/blob/d1e42d9296f6e014ffbbcec2ba295443b1675567/packages/llama-cpp-python/Dockerfile#L15) and here we [download whisper](https://github.com/defenseunicorns/leapfrogai/blob/d1e42d9296f6e014ffbbcec2ba295443b1675567/packages/whisper/Dockerfile#L14) into our whisper image. Some of the models we are trying to use are large (several GBs).


nit: "baking" is misspelled as "backing"

barronstone · 2024-07-12T14:43:44Z

adr/0006-model-registry.md

+	- The initialization time of pods is increased because of time spent moving the containers OCI layers into the pod.
+
+## Decision
+While no decision has been made yet, I am leaning towards proposing we go with the simplest solution of using PVCs to manage our GenAI models.


Since this is still in "Proposed" status, I would state PVCs as the decision rather than "leaning towards proposing". You are proposing!

barronstone · 2024-07-12T14:44:11Z

adr/0006-model-registry.md

+
+
+## Rationale
+N/A as no [Decision](#Decision) has been made yet.


Provide (proposed) rationale for PVC path forward

barronstone · 2024-07-12T14:51:45Z

adr/0006-model-registry.md

+### Raw PVC Attachments
+[k8s PVC Docs](https://kubernetes.io/docs/concepts/storage/persistent-volumes/)
+
+Maybe the simplest solution is the best solution? We can create a PersistantVolume for each model that gets populated during deploy time. This PersistantVolume will be mounted by all of the Pods that want to use that model.


You say "PersistantVolume" here, but then use the acronym PVC (PersistentVolumeClaim) elsewhere. A PVC is a request for PV resources, not the volume itself. Please be consistent and/or clarify the relationship in a way that non-k8s folks like me can easily understand when reading.

It would also be good to define the PVC acronym somewhere in this doc. You currently spell out "PersistantVolume", but jump straight to using PVC as an acronym.
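For readers less familiar with Kubernetes, the PV/PVC relationship raised above can be sketched as follows. This is a minimal illustration, not the ADR's actual manifests: the helper names (`model_volume_manifests`, `pod_volume`), the `-pv`/`-pvc` naming scheme, the `hostPath` backing store, and the `8Gi` size are all assumptions made for the example. The PersistentVolume describes the storage itself, the PersistentVolumeClaim is a request that binds to it, and each Pod refers only to the claim.

```python
import json


def model_volume_manifests(model_name: str, size: str = "10Gi"):
    """Build a (PersistentVolume, PersistentVolumeClaim) pair for one model.

    Hypothetical helper: names, size, and the hostPath location are
    illustrative assumptions, not values taken from the ADR.
    """
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": f"{model_name}-pv"},
        "spec": {
            "capacity": {"storage": size},
            # Many pods read the same weights; none of them write.
            "accessModes": ["ReadOnlyMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            # hostPath is only suitable for single-node testing.
            "hostPath": {"path": f"/models/{model_name}"},
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": f"{model_name}-pvc"},
        "spec": {
            "accessModes": ["ReadOnlyMany"],
            "resources": {"requests": {"storage": size}},
            # Bind this claim to the specific volume defined above.
            "volumeName": f"{model_name}-pv",
            "storageClassName": "",
        },
    }
    return pv, pvc


def pod_volume(model_name: str) -> dict:
    """Volume entry for a Pod spec: the Pod names the claim, not the PV."""
    return {
        "name": "model-weights",
        "persistentVolumeClaim": {
            "claimName": f"{model_name}-pvc",
            "readOnly": True,
        },
    }


if __name__ == "__main__":
    pv, pvc = model_volume_manifests("synthia", "8Gi")
    print(json.dumps([pv, pvc, pod_volume("synthia")], indent=2))
```

The point of the split is that Pods never name the volume directly. They mount the claim, and the claim is what gets matched to a volume.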

barronstone · 2024-07-12T14:53:10Z

adr/0006-model-registry.md

+
+
+Cons of PVC:
+- Hard to optimize re-deploys (If the model weights don't change) and benefit from caching.


Can you elaborate on why it's hard to optimize re-deploys? (Maybe it's obvious to k8s folks, but I don't understand.)

Also, nit: the "I" in "If" is capitalized and should be lowercase.
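To make the caching concern concrete, here is a rough sketch of the kind of check a deploy-time population step would need in order to skip re-copying unchanged weights. It does not answer the question of why this is hard in Kubernetes, and it is not taken from the ADR: the marker-file convention (`<file>.sha256` stored next to the weights) and the function names are assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large weights fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def needs_repopulation(source: Path, volume_dir: Path) -> bool:
    """True if the volume lacks the weights or holds a different version."""
    marker = volume_dir / (source.name + ".sha256")  # assumed convention
    target = volume_dir / source.name
    if not marker.exists() or not target.exists():
        return True
    return marker.read_text().strip() != file_digest(source)


def populate(source: Path, volume_dir: Path) -> bool:
    """Copy weights into the volume only when they changed.

    Returns True if a copy happened, False if it was skipped.
    """
    if not needs_repopulation(source, volume_dir):
        return False
    volume_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, volume_dir / source.name)
    (volume_dir / (source.name + ".sha256")).write_text(file_digest(source))
    return True
```

With container images, this kind of "already present, skip it" behaviour comes from layer caching; with a volume, something like the check above would have to be written and maintained by hand.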

barronstone · 2024-09-09T23:52:12Z

Greetings @YrrepNoj

chore: start ADR for model registry

dc555c6

gphorvath assigned YrrepNoj Jul 10, 2024

alekst23 added this to the Current - RAG UX Enhancements | Model Directory | API Odds and Ends milestone Jul 11, 2024

barronstone requested changes Jul 12, 2024


justinthelaw mentioned this pull request Jul 23, 2024

EPIC: IronBank LeapfrogAI Hardening #750

Open

justinthelaw added the ADR 🧐 Architecture Decision Record label Sep 4, 2024

jalling97 linked an issue Sep 5, 2024 that may be closed by this pull request

spike: model files pvc image hardening example #979

Open
