Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: install kfp on github periodic functional tests workflow #10859

Conversation

tmvfb
Copy link
Contributor

@tmvfb tmvfb commented Jun 2, 2024

Description of your changes:
Hopefully last fix for periodic functional tests. Current problems (#10851):

  1. Tests fail in GH workflows as they try to connect to the same GCP cluster that prow tests run on. The corresponding environment is not configured in GitHub.
  2. Tests fail in prow as they try to run sudo which is not available on python3.7-slim (runs on debian).

Solutions:

  1. For GH actions, use the kind cluster defined in the workflow file to run the test instead of GCP test cluster. This requires actually installing kfp on this kind cluster. I copypasted installation from the backend test workflow file. Let me know if there is a simpler/faster solution (test runs for 15 mins), standalone kfp deployment didn't work for me.
  2. Make tests aware of the environment they are running on. If running in GitHub actions (we can check if KIND_REGISTRY env var created by kind action is available), tests work without indicating host. For prow tests, run apt without sudo, as it was initially defined.

Checklist:

Copy link

google-cla bot commented Jun 2, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Copy link

Hi @tmvfb. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@tmvfb tmvfb force-pushed the fix/functional_tests_missing_kfp_cluster branch from ca941ce to d486dbe Compare June 2, 2024 08:23
@hbelmiro
Copy link
Contributor

hbelmiro commented Jun 3, 2024

/ok-to-test

Copy link
Member

@rimolive rimolive left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contributions! I left a few comments, let's work on these items before lgtm/approve


python3 "${source_root}/test/kfp-functional-test/run_kfp_functional_test.py" --host "${HOST}"
if [ -n "$KIND_REGISTRY" ]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When this variable is specified when GH Action is running?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable gets created when the kind action runs, see here. If we don't need prow tests, we can remove this conditional anyway.

- name: Build images
run: ./scripts/deploy/github/build-images.sh
- name: Deploy kfp-tekton
run: ./scripts/deploy/github/deploy-kfp.sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These new lines are supposed to be only used in kfp-tekton backend tests. Periodic tests don't run any tests against kfp-tekton.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can run the tests against standalone kfp then? I realised that my tests didn't work since I didn't allow for any time for the kfp pods to come up in the test cluster.

if [ -n "$KIND_REGISTRY" ]; then
# if KIND_REGISTRY is available, we're running GH actions, host is not required
python3 "${source_root}/test/kfp-functional-test/run_kfp_functional_test.py" --host ""
else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We expect to remove the Prow jobs to run these tests, I think this confitional is not required.

@tmvfb tmvfb force-pushed the fix/functional_tests_missing_kfp_cluster branch from d486dbe to d0b7262 Compare June 3, 2024 23:33
@tmvfb
Copy link
Contributor Author

tmvfb commented Jun 3, 2024

@rimolive, completely reworked the PR, now we deploy standalone kfp inside GitHub actions kind cluster instead of kfp-tekton. I also simplified shell script considering we won't need prow tests anymore.

Copy link
Member

@rimolive rimolive left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last comment.

test/kfp-functional-test/kfp-functional-test.sh Outdated Show resolved Hide resolved
@tmvfb tmvfb force-pushed the fix/functional_tests_missing_kfp_cluster branch 2 times, most recently from c9907bd to 1e0bdb6 Compare June 5, 2024 20:21
@rimolive
Copy link
Member

rimolive commented Jun 5, 2024

Don't forget to sign-off your commits

@tmvfb tmvfb force-pushed the fix/functional_tests_missing_kfp_cluster branch from 1e0bdb6 to 99298b8 Compare June 5, 2024 20:32
@tmvfb
Copy link
Contributor Author

tmvfb commented Jun 5, 2024

Don't forget to sign-off your commits

Done!

Copy link

@tmvfb: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
kubeflow-pipeline-e2e-test 99298b8 link false /test kubeflow-pipeline-e2e-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@rimolive
Copy link
Member

rimolive commented Jun 5, 2024

Thank you!

/lgttm

cc @DharmitD @chensun

Copy link
Contributor

@DharmitD DharmitD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes /lgtm

@tmvfb could you provide a sample run on your fork with these changes?

@tmvfb
Copy link
Contributor Author

tmvfb commented Jun 5, 2024

@DharmitD, https://github.com/prokube-ai/pipelines/actions/runs/9390604598

Copy link
Contributor

@DharmitD DharmitD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@rimolive
Copy link
Member

rimolive commented Jun 6, 2024

/approve

@rimolive
Copy link
Member

rimolive commented Jun 6, 2024

cc @chensun

@rimolive
Copy link
Member

@chensun Can you approve?

Copy link
Contributor

@DharmitD DharmitD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/Approve

Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DharmitD, rimolive

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@hbelmiro
Copy link
Contributor

hbelmiro commented Jul 1, 2024

@tmvfb can you please rebase?

@tmvfb tmvfb force-pushed the fix/functional_tests_missing_kfp_cluster branch from 99298b8 to 056ffaf Compare July 1, 2024 16:47
@google-oss-prow google-oss-prow bot removed the lgtm label Jul 1, 2024
@tmvfb
Copy link
Contributor Author

tmvfb commented Jul 1, 2024

@tmvfb can you please rebase?

@hbelmiro, @DharmitD, done

kubectl_version: v1.29.2
version: v0.22.0
node_image: kindest/node:v1.29.2
- name: Deploy kfp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing @tmvfb. Sorry.
Can you please use the new kfp-cluster action here? Like what is done in

- name: Create KFP cluster
.

Copy link
Contributor Author

@tmvfb tmvfb Jul 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hbelmiro, no problem, edited commits and tested here. Sorry for so many force pushes :D

@hbelmiro
Copy link
Contributor

hbelmiro commented Jul 1, 2024

/hold

@tmvfb tmvfb force-pushed the fix/functional_tests_missing_kfp_cluster branch 2 times, most recently from 8029cb8 to 60be588 Compare July 2, 2024 10:24
tmvfb added 2 commits July 2, 2024 12:33
Signed-off-by: Igor Kvachenok <igor.kvachenok@prokube.ai>
Signed-off-by: Igor Kvachenok <igor.kvachenok@prokube.ai>
@tmvfb tmvfb force-pushed the fix/functional_tests_missing_kfp_cluster branch from 60be588 to e14d5c8 Compare July 2, 2024 10:35
Copy link
Contributor

@hbelmiro hbelmiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/unhold
/lgtm

@google-oss-prow google-oss-prow bot added the lgtm label Jul 2, 2024
@google-oss-prow google-oss-prow bot merged commit 7d2ec39 into kubeflow:master Jul 2, 2024
3 checks passed
@tmvfb tmvfb deleted the fix/functional_tests_missing_kfp_cluster branch July 5, 2024 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants