Consider fetching one model from Git rather than S3 #57
When testing I also used
Are the models and pipelines those at https://github.com/opendatahub-io/ai-edge/tree/main/pipelines/models?
@adelton yes, those are the two models that are used. Currently, instead of being consumed from there (or from some clone of this repo in Gitea), they're being downloaded from AWS S3 during the image build pipeline here, which means that to reproduce the demo, someone needs to configure S3, put the models there, etc. Whether they should actually be checked in to a Git repo is up for debate (especially for larger models), so I think the thing we're trying to optimize for is ease of setup for the demo. An alternative approach that @LaVLaS suggested is that minio could be provisioned in the cluster and the models hosted there. I think someone else suggested that they could be artifacts on a GitHub release, attached at the end of some model training pipeline or similar.
I think model size is the determining factor in where the versioned model lives. If we think the models for a POC will be small enough to commit to git, then it would be much simpler to use git as the "model registry". A minio install is easy enough that we could add it as an application to be deployed, and we can replace it with another object store as needed for any use case with specific requirements.
What mechanism is used in the non-edge Open Data Hub or RHODS to configure or distinguish those two approaches? This looks like a generic enough topic that the Edge PoC should simply be able to inherit it from an existing Open Data Hub or RHODS demo, and get all those options for free. Am I missing something?
@adelton my understanding is that in ODH, the more normal thing to do would be to fetch the model at run time into the model server, rather than at build time of an image containing both server and model. I could be wrong, but my guess is that OpenShift Pipelines/Tekton also haven't traditionally been used for that in ODH, since no image needed to be created: ODH would provide pre-built model server images, users would provide a model, and those would meet at runtime (I'm actually not sure how/where a model is specified in the normal ODH way).

Why do we build an image with both the model server and the model? I think this may have been a decision made when also thinking about far edge scenarios where there might not even be a *shift to run the usual model serving setup that ODH uses. I think there was also consideration given to the fact that in edge locations there might not be (m)any technical staff to fix things when they go awry, so running an immutable image might be considered simpler than trying to dynamically load the model at run time.

Edit: So while it would be good to get all of this to be first-class in ODH, there currently isn't really an equivalent, in my understanding. I hope at least some of this is related to your question :)
It is. I did not realize that in ODH the standard way is to use ODH-provided pre-built model server images. |
The model serving frameworks allow a data scientist to abstract away the implementation details of the gRPC/REST framework used to serve (host) their model for inferencing. Pulling the model from S3 is the standard method, so you can easily swap model versions on demand without having to rebuild the model server.
Correct again. In edge scenarios, updating the model is a workflow similar to flashing a BIOS. You want to keep the process as simple and self-contained as possible, so anyone can simply insert an update medium, wait for the update to complete, and walk away knowing that the system is running with the latest updates.
We have two example models, and associated image build pipelines that pull those models from S3. As a reusable demo, it would be easier to set up if one of these (the smaller one, ideally) was pulled from Git, so that the person demoing can just interact with that one, and doesn't have to also have an AWS account and access to an S3 bucket (either a shared one that we provide the location for, or one of their own where they can put a trained model).
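The proposal above boils down to the image build pipeline choosing a fetch mechanism from the model's source URI. A minimal sketch of that decision, assuming a hypothetical helper and placeholder URIs (none of these names come from the ai-edge repo itself):

```python
from urllib.parse import urlparse

def fetch_command(model_uri: str, dest: str = "model") -> list[str]:
    """Hypothetical helper: pick a shell command to fetch a model artifact.

    s3:// URIs require an AWS account and a configured bucket;
    a .git URI needs nothing beyond public network access, which is
    the ease-of-setup argument made in this issue.
    """
    if urlparse(model_uri).scheme == "s3":
        # Build-time pull from object storage (current approach).
        return ["aws", "s3", "cp", "--recursive", model_uri, dest]
    if model_uri.endswith(".git"):
        # Shallow clone of a repo that has the model checked in.
        return ["git", "clone", "--depth", "1", model_uri, dest]
    raise ValueError(f"unsupported model source: {model_uri}")

print(fetch_command("s3://example-bucket/models/tf-small"))
print(fetch_command("https://github.com/opendatahub-io/ai-edge.git"))
```

The same dispatch point is where an in-cluster minio endpoint could later be slotted in without changing the rest of the pipeline.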