
Consider fetching one model from Git rather than S3 #57

Closed
grdryn opened this issue Sep 5, 2023 · 8 comments · Fixed by #112
Labels: feature/pipelines, good first issue, kind/enhancement, priority/high

grdryn (Member) commented Sep 5, 2023

We have two example models, and associated image build pipelines that pull those models from S3. As a reusable demo, it would be easier to set up if one of these (ideally the smaller one) was pulled from Git, so that the person demoing can just interact with that one and doesn't also need an AWS account and access to an S3 bucket (either a shared one that we provide the location for, or one of their own where they can put a trained model).

piotrpdev (Member) commented

When testing, I also used git clone to get the model. I mentioned more ways of getting the model in the pull request with these changes. Here it is:

Apologies if the naming is confusing: the download is done in pipelines/tekton/azureml-container-pipeline/kserve-download-model.yaml. It downloads the model files from an S3 storage bucket, and those files include Containerfiles, like the models in the repo do.

Edgar's Gist has some more examples of how the download is done, including downloading from Azure (I haven't implemented this yet). If Git is involved, a git clone could be done inside the Containerfile or with the git-clone Tekton Task that is included with OpenShift Pipelines.
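
For illustration only, a clone step using the git-clone ClusterTask might look roughly like the sketch below; the pipeline name, repository URL, and downstream tasks are placeholders, not something that exists in this repo today.

```yaml
# Hypothetical sketch: fetch the model sources with the git-clone (Cluster)Task
# shipped with OpenShift Pipelines instead of downloading them from S3.
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: model-image-build-from-git    # placeholder name
spec:
  workspaces:
    - name: shared-workspace
  tasks:
    - name: fetch-model-repo
      taskRef:
        name: git-clone
        kind: ClusterTask             # provided by OpenShift Pipelines
      workspaces:
        - name: output
          workspace: shared-workspace
      params:
        - name: url
          value: https://github.com/opendatahub-io/ai-edge.git  # or a Gitea clone
        - name: revision
          value: main
    # later tasks (e.g. a buildah build of the model Containerfile) would read
    # the cloned contents from the shared workspace
```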

adelton (Contributor) commented Sep 6, 2023

We have two example models, and associated image build pipelines ...

Are the models and pipelines those at https://github.com/opendatahub-io/ai-edge/tree/main/pipelines/models?

grdryn (Member, Author) commented Sep 6, 2023

@adelton yes, those are the two models that are used. Currently, instead of being consumed from there (or from some clone of this repo in Gitea), they're being downloaded from AWS S3 during the image build pipeline here, which means that to reproduce the demo, someone needs to configure S3 and put the models there, etc.

Whether they should actually be checked-in to a Git repo is up for debate (especially for larger models), so I think the thing we're trying to optimize for is ease of setup for the demo.

An alternative approach that @LaVLaS suggested is that minio could be provisioned in the cluster and the models hosted there. I think someone else suggested that they could be artifacts on a GitHub release, attached at the end of some model training pipeline or something.
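
To make the release-artifact idea a bit more concrete, the fetch could be a small Tekton Task that downloads the attached archive into a workspace. Everything in the sketch below (Task name, image, URL, file names) is hypothetical.

```yaml
# Hypothetical sketch: fetch a model archive attached to a GitHub release.
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: fetch-model-release-asset     # placeholder name
spec:
  workspaces:
    - name: model-dir
  params:
    - name: assetUrl
      description: URL of the model archive attached to a GitHub release
      default: https://github.com/example-org/example-repo/releases/download/v0.1.0/model.tar.gz
  steps:
    - name: download
      image: registry.access.redhat.com/ubi9/ubi   # any image with curl and tar
      script: |
        #!/bin/sh
        set -e
        # download the release asset and unpack it into the shared workspace
        curl -L "$(params.assetUrl)" -o /tmp/model.tar.gz
        tar -xzf /tmp/model.tar.gz -C "$(workspaces.model-dir.path)"
```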

LaVLaS (Contributor) commented Sep 6, 2023

I think model size is the determining factor in where the versioned model lives. If we think the models for a POC will be small enough for committing to Git, then it would be much simpler to use Git as the "model registry". A minio install is easy enough that we could add it as an application to be deployed, and we can replace it with another object store as needed for any use case with specific needs.
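
As a rough illustration of how small that install could be, an in-cluster MinIO might be little more than a Deployment and Service like the sketch below; the image tag, credentials, and lack of persistent storage are placeholders, not a recommended setup.

```yaml
# Hypothetical sketch of an in-cluster MinIO; a real deployment should use a
# Secret for credentials and a PersistentVolumeClaim for /data.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
spec:
  replicas: 1
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
        - name: minio
          image: quay.io/minio/minio:latest
          args: ["server", "/data", "--console-address", ":9001"]
          env:
            - name: MINIO_ROOT_USER
              value: minioadmin        # placeholder credential
            - name: MINIO_ROOT_PASSWORD
              value: minioadmin        # placeholder credential
          ports:
            - containerPort: 9000      # S3 API
            - containerPort: 9001      # web console
---
apiVersion: v1
kind: Service
metadata:
  name: minio
spec:
  selector:
    app: minio
  ports:
    - name: s3
      port: 9000
      targetPort: 9000
```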

adelton (Contributor) commented Oct 4, 2023

What mechanism is used in the non-edge Open Data Hub or RHODS to configure or distinguish those two approaches? This looks like a generic enough topic that the Edge PoC should simply be able to inherit it from whatever Open Data Hub or RHODS demo exists, and get all those options for free.

Am I missing something?

grdryn (Member, Author) commented Oct 4, 2023

@adelton my understanding is that in ODH, the more normal thing to do would be to fetch the model at run time into the model server, rather than at build time of an image containing both server and model. I could be wrong, but my guess is that OpenShift Pipelines/Tekton also haven't traditionally been used for that in ODH, since no image needed to be created: ODH would provide pre-built model server images, users would provide a model, and those would meet at runtime (I'm actually not sure how/where a model is specified in the normal ODH way).
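
For context, that run-time-fetch pattern is usually expressed in KServe-style serving with an InferenceService whose storageUri points at object storage. The sketch below is illustrative only; the model format, runtime name, and bucket path are assumptions rather than anything from this repo.

```yaml
# Hypothetical sketch: the model is pulled from object storage when the
# pre-built model server starts, instead of being baked into a custom image.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-model                  # placeholder name
spec:
  predictor:
    model:
      modelFormat:
        name: onnx                     # assumed format
      runtime: example-serving-runtime # placeholder ServingRuntime
      storageUri: s3://example-bucket/models/example-model/
```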

Why do we build an image with both the model server and the model? I think this may have been a decision that was made when also thinking about far edge scenarios where there might not even be a *shift to run the usual model serving setup that ODH uses. I think there was also consideration given to the fact that in edge locations, there might not be (m)any technical staff to fix things when they go awry, so running an immutable image might be considered simpler than trying to dynamically load at run time.
I'm not sure how convincing or correct that last part is. One thing that @danielezonca (I think?) mentioned before is that even with two models, it's probably more resource-efficient to use a shared model server rather than a separate one per model.

Edit: So while it would be good to get all of this to be first-class in ODH, there currently isn't really an equivalent, in my understanding.

I hope at least some of this is related to your question :)

adelton (Contributor) commented Oct 4, 2023

It is.

I did not realize that in ODH the standard way is to use ODH-provided pre-built model server images.

piotrpdev added the priority/high label on Oct 5, 2023
LaVLaS (Contributor) commented Oct 11, 2023

The model serving frameworks allow a data scientist to abstract away the implementation details of the gRPC/REST framework that serves (hosts) their model for inferencing. Pulling the model from S3 is the standard method, so you can easily swap model versions on demand without having to rebuild the model server.

I think there was also consideration given to the fact that in edge locations, there might not be (m)any technical staff to fix things when they go awry, so running an immutable image might be considered simpler than trying to dynamically load at run time.

Correct again. In edge scenarios, updating the model is a workflow similar to flashing a BIOS. You want to keep the process as simple and self-contained as possible, so anyone can simply insert the update media, wait for the update to complete, and walk away knowing that the system is running with the latest updates.

LaVLaS added the feature/pipelines label on Oct 26, 2023