Need documentation for air-gapped (offline) on-prem deployment #1405

Open
Yu-amd opened this issue Jan 16, 2025 · 2 comments

Comments


Yu-amd commented Jan 16, 2025

OPEA should provide documentation and a reference architecture covering the mechanisms for storing and deploying applications along with all their dependencies (e.g., container images, Helm charts), and for hosting model repositories locally.

Enterprises operating in secure environments need a fully offline solution.
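For illustration, a fully offline flow typically means mirroring public artifacts on a connected host and carrying them across the air gap into a local registry; a rough sketch, where the image name and registry address are placeholders rather than OPEA defaults:

```sh
# On a connected host: pull the image, retag it for the local registry,
# and export it for offline transfer (image/registry names are examples).
docker pull opea/chatqna:latest
docker tag opea/chatqna:latest registry.local:5000/opea/chatqna:latest
docker save -o chatqna.tar registry.local:5000/opea/chatqna:latest

# On the air-gapped host: load the archive and push it to the local
# registry that the Helm charts are pointed at.
docker load -i chatqna.tar
docker push registry.local:5000/opea/chatqna:latest
```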

eero-t (Contributor) commented Jan 20, 2025

I don't think documentation is enough.

Currently, model downloading is done separately by each container when it starts, which requires those services to have write access to the model volume. That means the user/admin may not even know whether the node will run out of disk space before all those services are ready...

I think there should be a separate model downloader that pre-fetches all relevant models to the model volume, and that volume should be set read-only afterwards. IMHO this is how it should be done (and documented) by default; downloading models at run time should be the exception.
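A minimal sketch of such a pre-fetch step, assuming the huggingface_hub CLI and using a placeholder volume path and model list (not OPEA defaults):

```sh
#!/bin/sh
# Hypothetical one-shot pre-fetch: populate the shared model volume once,
# before any inference service starts. Path and model list are examples.
set -e
MODEL_DIR=/mnt/models

pip install --quiet "huggingface_hub[cli]"

for m in "Intel/neural-chat-7b-v3-3" "BAAI/bge-base-en-v1.5"; do
    huggingface-cli download "$m" --local-dir "$MODEL_DIR/$m"
done

# Drop write permission so the serving containers can only read the volume.
chmod -R a-w "$MODEL_DIR"
```

In Kubernetes this could run as a one-shot Job against the model PVC, after which the services mount the same volume with readOnly: true.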

eero-t (Contributor) commented Jan 20, 2025

Helm charts already use the HF downloader in initContainers: https://github.com/opea-project/GenAIInfra/blob/main/helm-charts/common/vllm/templates/deployment.yaml#L53

There could be a separate script / container using that, which would download all specified models to the location expected by the services. Models could be specified either directly, or the script could e.g. pick their names from the listed service specs / Helm charts.
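As a rough illustration of scraping model names from the charts, assuming the model IDs appear as env values named LLM_MODEL_ID in the rendered manifests (the variable name and chart path are assumptions):

```sh
# Render the charts, collect the unique model IDs, then pre-fetch each one.
helm template chatqna helm-charts/chatqna \
  | grep -A1 'name: LLM_MODEL_ID' \
  | awk '/value:/ {print $2}' \
  | tr -d '"' \
  | sort -u \
  | while read -r m; do
      huggingface-cli download "$m" --local-dir "/mnt/models/$m"
    done
```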

PS. One more advantage of this would be not needing to provide the secret HF token to all the inferencing services.
