Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate kubeflow-katib-presubmit to GitHub Actions #1882

Merged
merged 18 commits into from
Jun 6, 2022

Conversation

tenzen-y
Copy link
Member

@tenzen-y tenzen-y commented Jun 3, 2022

What this PR does / why we need it:
I migrated E2E tests from kubeflow-testinfra to GitHub Actions with KinD Kubernetes Cluster since we no longer have access to kubeflow-katib-presubmit.
Ref: #1838

So, I made the following changes:

  1. Moved old E2E test codes to test/e2e/v1beta1/[scripts|hack]/aws directory.
  2. Created new E2E test codes in test/e2e/v1beta1/[scripts|hack]/gh-actions direcotory.
  3. Built new workflows similar to kubeflow-katib-presubmit in GitHub Actions.
  4. As discussed in our community meeting, reduced the following container images for Suggestion and EarlyStopping Services by changing the base image from python:3.9 to python:3.9-slim:
    • cmd/earlystopping/medianstop/v1beta1/Dockerfile
    • cmd/metricscollector/v1beta1/tfevent-metricscollector/Dockerfile
    • cmd/suggestion/chocolate/v1beta1/Dockerfile
    • cmd/suggestion/hyperband/v1beta1/Dockerfile
    • cmd/suggestion/hyperopt/v1beta1/Dockerfile
    • cmd/suggestion/nas/darts/v1beta1/Dockerfile
    • cmd/suggestion/nas/enas/v1beta1/Dockerfile
    • cmd/suggestion/optuna/v1beta1/Dockerfile
    • cmd/suggestion/skopt/v1beta1/Dockerfile
    • examples/v1beta1/trial-images/enas-cnn-cifar10/Dockerfile.cpu
  5. Reduced the following container images for Trial containers by splitting container images by device types (CPU or GPU):
    • examples/v1beta1/trial-images/darts-cnn-cifar10/Dockerfile.cpu
    • examples/v1beta1/trial-images/darts-cnn-cifar10/Dockerfile.gpu
    • examples/v1beta1/trial-images/pytorch-mnist/Dockerfile
  6. Changed parameters of examples to reduce required computer resources (ex., system memory, storage size, etc.).

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1878

/cc @andreyvelich @johnugeorge @gaocegege

Checklist:

  • Docs included if any changes are user facing

@coveralls
Copy link

coveralls commented Jun 3, 2022

Coverage Status

Coverage remained the same at 73.948% when pulling 2d21f08 on tenzen-y:migrate-test-infra-to-gha into 91974f1 on kubeflow:master.

Copy link
Member

@gaocegege gaocegege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution! 🎉 👍

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gaocegege, tenzen-y

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

docker builder prune
}

run() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If i understand correctly, core images will be built for every job while suggestion/early stopping images will be built if used in the experiment.
Right? Is there any other way that we can avoid duplicate build?

Is this https://github.com/docker/build-push-action/blob/master/docs/advanced/share-image-jobs.md helpful in this case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know your thoughts on it. /cc @andreyvelich

Copy link
Member

@johnugeorge johnugeorge Jun 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can relook at optimising it later. This looks good.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If i understand correctly, core images will be built for every job while suggestion/early stopping images will be built if used in the experiment.

Yes, that's right. @johnugeorge

Is this https://github.com/docker/build-push-action/blob/master/docs/advanced/share-image-jobs.md helpful in this case?

SGTM
I did not know how to do that.
Thanks! @johnugeorge

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can relook at optimising it later. This looks good.

After merging this PR, I would create another PR to optimize the build script.
WDYT? @johnugeorge

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure Thanks @tenzen-y

@johnugeorge
Copy link
Member

Thanks @tenzen-y for this hard work

@tenzen-y
Copy link
Member Author

tenzen-y commented Jun 3, 2022

Thanks @tenzen-y for this hard work

Thank you for your comments and review.

@tenzen-y
Copy link
Member Author

tenzen-y commented Jun 3, 2022

/hold

@tenzen-y
Copy link
Member Author

tenzen-y commented Jun 3, 2022

@johnugeorge Please take another look.

/hold cancel

@johnugeorge
Copy link
Member

Sure. I will keep it open for 1-2 more days so that others can take a look at it.

@tenzen-y
Copy link
Member Author

tenzen-y commented Jun 3, 2022

Sure. I will keep it open for 1-2 more days so that others can take a look at it.

It makes sense.

@tenzen-y
Copy link
Member Author

tenzen-y commented Jun 4, 2022

I have pushed to rerun Go Test.

Copy link
Member

@gaocegege gaocegege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@johnugeorge
Copy link
Member

Thanks @tenzen-y
/lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate kubeflow-katib-presubmit to GitHub Actions
4 participants