Nightly (regular) build of container images #666
Comments
@jlewi what do you think about just doing this as part of the postsubmit process? Just check that the |
So triggering it based on postsubmits is ideal. Here's the challenge. We don't want uncommitted code to be able to write our release artifacts (including nightly builds). Our test infrastructure runs "minimally vetted code" in the sense that any PR labeled /ok-to-test will run the code. I don't want unmerged code to be able to write or modify our release artifacts. Also, the way we currently use Prow to trigger workflows is that the K8s CI bot creates arbitrary Argo workflows, which opens another large hole through which code could be injected to modify our release artifacts. All of this is solvable, but we would need to think it through and lock it down. Right now we build our release artifacts in a separate GCP project in a separate GKE cluster, so none of our test infra has access to our release repositories. We just need to figure out how to trigger jobs in that cluster without compromising security. |
This project looks promising as a way of building container images in cluster |
Postsubmits don't run until code is already committed, correct? What I meant is that we can release to the staging area during postsubmit and manually release official images. |
That's correct. But our postsubmit tests run with the same credentials as our presubmits. So in the current setup we can't give our postsubmits access to the staging repository without also granting that permission to code running in presubmits. At that point we can no longer trust the images in the staging area. |
/assign @kunmingg |
@kunmingg Any progress on this in the last sprint? |
@jlewi May need another 2 days? |
@kunmingg What's the status of this? Which Docker images are now being built regularly? |
@jlewi The code is now merged, and I'll turn on the cron job today. |
@kunmingg It looks like the cron job was enabled and new images were built e.g. Can we close this? |
It seems we have some failed release tasks, such as TF Serving GPU. We need to verify that we have enough hardware resources for auto release. |
I will update the Dockerfile to point to resources within GCP, and I'm creating a new issue for it. |
/close |
We are adding more and more Docker images (e.g. PyTorch, Central UI, Katib, etc.).
We need a way to regularly rebuild and push the latest images to a public repository e.g. gcr.io/kubeflow-images-staging or gcr.io/kubeflow-ci
Currently this is all done manually.
Our release process consists of a bunch of Argo workflows.
So if we wanted to automate this we could just create a cron job to invoke run_e2e_workflow.py using a YAML file that specifies all the workflows for a release.
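A minimal sketch of what the cron entry point could look like, assuming the script accepts the YAML file through a `--config_file` flag (the flag name, the `releases.yaml` file name, and both helper functions are assumptions for illustration, not the actual interface):

```python
import shlex
import subprocess
import sys


def build_release_command(config_file, script="run_e2e_workflow.py", extra_args=()):
    """Build the argv list for one release run.

    The --config_file flag name is an assumption; adjust it to whatever
    the real script expects.
    """
    return [sys.executable, script, f"--config_file={config_file}", *extra_args]


def run_nightly_release(config_file, runner=subprocess.run, script="run_e2e_workflow.py"):
    """Invoke the release workflows once and return the process exit code.

    `runner` is injectable so the function can be exercised without
    actually launching the workflows.
    """
    cmd = build_release_command(config_file, script=script)
    print("Running:", shlex.join(cmd))
    result = runner(cmd, check=False)
    return result.returncode
```

A cron job would then only need to call `run_nightly_release("releases.yaml")` on its schedule and alert on a non-zero return code. Keeping the command construction separate from the execution makes it easy to check what would run before pointing it at the release cluster.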
@willb Will, any interest in taking this on?
/cc @jose5918
/cc @willb