
Consider fetching one model from Git rather than S3 #57

Closed
grdryn opened this issue Sep 5, 2023 · 8 comments · Fixed by #112
Labels: feature/pipelines, good first issue, kind/enhancement, priority/high

grdryn (Member) commented Sep 5, 2023

We have two example models, and associated image build pipelines that pull those models from S3. As a reusable demo, it would be easier to set up if one of these (ideally the smaller one) was pulled from Git, so that the person demoing can just interact with that one and doesn't also need an AWS account and access to an S3 bucket (either a shared one that we provide the location for, or one of their own where they can put a trained model).

piotrpdev (Member) commented

When testing, I also used git clone to get the model. I mentioned more ways of getting the model in the pull request with these changes. Here it is:

Apologies if the naming is confusing: the download is done in pipelines/tekton/azureml-container-pipeline/kserve-download-model.yaml. It downloads the model files from an S3 storage bucket, and those files include Containerfiles, like the models in the repo do.

Edgar's Gist has some more examples of how the download is done, including downloading from Azure (I haven't implemented this yet). If Git is involved, a git clone could be done inside the Containerfile or with the git-clone Tekton Task that is included with OpenShift Pipelines.
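
For illustration only, a clone step using the git-clone ClusterTask might look roughly like the sketch below; the pipeline name, repository URL, and downstream tasks are placeholders, not something that exists in this repo today.

```yaml
# Hypothetical sketch: fetch the model sources with the git-clone (Cluster)Task
# shipped with OpenShift Pipelines instead of downloading them from S3.
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: model-image-build-from-git    # placeholder name
spec:
  workspaces:
    - name: shared-workspace
  tasks:
    - name: fetch-model-repo
      taskRef:
        name: git-clone
        kind: ClusterTask             # provided by OpenShift Pipelines
      workspaces:
        - name: output
          workspace: shared-workspace
      params:
        - name: url
          value: https://github.com/opendatahub-io/ai-edge.git  # or a Gitea clone
        - name: revision
          value: main
    # later tasks (e.g. a buildah build of the model Containerfile) would read
    # the cloned contents from the shared workspace
```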

adelton (Contributor) commented Sep 6, 2023

We have two example models, and associated image build pipelines ...

Are the models and pipelines those at https://github.com/opendatahub-io/ai-edge/tree/main/pipelines/models?

grdryn (Member, Author) commented Sep 6, 2023

@adelton yes, those are the two models that are used. Currently, instead of being consumed from there (or from some clone of this repo in Gitea), they're being downloaded from AWS S3 during the image build pipeline here, which means that to reproduce the demo, someone needs to configure S3 and put the models there, etc.

Whether they should actually be checked-in to a Git repo is up for debate (especially for larger models), so I think the thing we're trying to optimize for is ease of setup for the demo.

An alternative approach that @LaVLaS suggested is that minio could be provisioned in the cluster and the models hosted there. I think someone else suggested that they could be artifacts on a GitHub release, attached at the end of some model training pipeline or something.
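
To make the release-artifact idea a bit more concrete, the fetch could be a small Tekton Task that downloads the attached archive into a workspace. Everything in the sketch below (Task name, image, URL, file names) is hypothetical.

```yaml
# Hypothetical sketch: fetch a model archive attached to a GitHub release.
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: fetch-model-release-asset     # placeholder name
spec:
  workspaces:
    - name: model-dir
  params:
    - name: assetUrl
      description: URL of the model archive attached to a GitHub release
      default: https://github.com/example-org/example-repo/releases/download/v0.1.0/model.tar.gz
  steps:
    - name: download
      image: registry.access.redhat.com/ubi9/ubi   # any image with curl and tar
      script: |
        #!/bin/sh
        set -e
        # download the release asset and unpack it into the shared workspace
        curl -L "$(params.assetUrl)" -o /tmp/model.tar.gz
        tar -xzf /tmp/model.tar.gz -C "$(workspaces.model-dir.path)"
```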

LaVLaS (Contributor) commented Sep 6, 2023

I think model size is the determining factor in where the versioned model lives. If we think the models for a POC will be small enough for committing to Git, then it would be much simpler to use Git as the "model registry". A minio install is easy enough that we could add it as an application to be deployed, and we can replace it with another object store as needed for any use case with specific needs.
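
As a rough illustration of how small that install could be, an in-cluster MinIO might be little more than a Deployment and Service like the sketch below; the image tag, credentials, and lack of persistent storage are placeholders, not a recommended setup.

```yaml
# Hypothetical sketch of an in-cluster MinIO; a real deployment should use a
# Secret for credentials and a PersistentVolumeClaim for /data.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
spec:
  replicas: 1
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
        - name: minio
          image: quay.io/minio/minio:latest
          args: ["server", "/data", "--console-address", ":9001"]
          env:
            - name: MINIO_ROOT_USER
              value: minioadmin        # placeholder credential
            - name: MINIO_ROOT_PASSWORD
              value: minioadmin        # placeholder credential
          ports:
            - containerPort: 9000      # S3 API
            - containerPort: 9001      # web console
---
apiVersion: v1
kind: Service
metadata:
  name: minio
spec:
  selector:
    app: minio
  ports:
    - name: s3
      port: 9000
      targetPort: 9000
```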

adelton (Contributor) commented Oct 4, 2023

What mechanism is used in the non-edge Open Data Hub or RHODS to configure or distinguish those two approaches? This looks like a generic enough topic that the Edge PoC should simply be able to inherit it from whatever Open Data Hub or RHODS demo exists, and get all those options for free.

Am I missing something?

grdryn (Member, Author) commented Oct 4, 2023

@adelton my understanding is that in ODH, the more normal thing to do would be to fetch the model at run time into the model server, rather than at build time of an image containing both server and model. I could be wrong, but my guess is that OpenShift Pipelines/Tekton also haven't traditionally been used for that in ODH, since no image needed to be created: ODH would provide pre-built model server images, users would provide a model, and those would meet at runtime (I'm actually not sure how/where a model is specified in the normal ODH way).
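
For context, that run-time-fetch pattern is usually expressed in KServe-style serving with an InferenceService whose storageUri points at object storage. The sketch below is illustrative only; the model format, runtime name, and bucket path are assumptions rather than anything from this repo.

```yaml
# Hypothetical sketch: the model is pulled from object storage when the
# pre-built model server starts, instead of being baked into a custom image.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-model                  # placeholder name
spec:
  predictor:
    model:
      modelFormat:
        name: onnx                     # assumed format
      runtime: example-serving-runtime # placeholder ServingRuntime
      storageUri: s3://example-bucket/models/example-model/
```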

Why do we build an image with both the model server and the model? I think this may have been a decision that was made when also thinking about far edge scenarios where there might not even be a *shift to run the usual model serving setup that ODH uses. I think there was also consideration given to the fact that in edge locations, there might not be (m)any technical staff to fix things when they go awry, so running an immutable image might be considered simpler than trying to dynamically load at run time.
I'm not sure how convincing or correct that last part is. One thing that @danielezonca (I think?) mentioned before is that even with two models, it's probably more resource-efficient to use a shared model server rather than a separate one per model.

Edit: So while it would be good to get all of this to be first-class in ODH, there currently isn't really an equivalent, in my understanding.

I hope at least some of this is related to your question :)

adelton (Contributor) commented Oct 4, 2023

It is.

I did not realize that in ODH the standard way is to use ODH-provided pre-built model server images.

piotrpdev added the priority/high label on Oct 5, 2023
LaVLaS (Contributor) commented Oct 11, 2023

The model serving frameworks allow a data scientist to abstract away the implementation details of the gRPC/REST framework that serves (hosts) their model for inferencing. Pulling the model from S3 is the standard method, so you can easily swap model versions on demand without having to rebuild the model server.

I think there was also consideration given to the fact that in edge locations, there might not be (m)any technical staff to fix things when they go awry, so running an immutable image might be considered simpler than trying to dynamically load at run time.

Correct again. In edge scenarios, updating the model is a workflow similar to flashing a BIOS. You want to keep the process as simple and self-contained as possible, so anyone can simply insert the update media, wait for the update to complete, and walk away knowing that the system is running with the latest updates.

LaVLaS added the feature/pipelines label on Oct 26, 2023