Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sdk+backend): Add support for generic ephemeral volume #10605

Merged
merged 4 commits into from
Apr 9, 2024

Conversation

abaland
Copy link
Contributor

@abaland abaland commented Mar 22, 2024

Description of your changes:

This change is the second half of the intended change in #10602 but adds the actual support for ephemeral volume in SDK and KFP backend.

I haven't actually tested this locally yet (I think I'll need to update the go.mod once the other PR is approved/merged/deployed), but I think the code will mostly stay the same as long as no bugs occurs when i try this out in our KFP fork

Checklist:

Copy link

Hi @abaland. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -17,7 +17,7 @@
from kfp import kubernetes


class TestUseSecretAsVolume:
class TestNodeSelector:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

completely unrelated, just saw this when debugging unit tests

@abaland abaland force-pushed the feat/ephemeral_volume2 branch 2 times, most recently from db85cbe to 4f15bc1 Compare March 22, 2024 13:52
@Tomcli
Copy link
Member

Tomcli commented Mar 25, 2024

/ok-to-test

@abaland
Copy link
Contributor Author

abaland commented Mar 26, 2024

/retest

@abaland abaland force-pushed the feat/ephemeral_volume2 branch 2 times, most recently from 6273707 to a7efe78 Compare March 26, 2024 06:16
@abaland
Copy link
Contributor Author

abaland commented Mar 26, 2024

/retest

@abaland
Copy link
Contributor Author

abaland commented Mar 26, 2024

Is there a way to get access to the Artifacts produced by the above tests?
I found the cause for 2 of the failures (forgot to mod tidy and to updates license) but the rest one is an error at integration tests preparation and if i try to look at the artifacts in https://oss.gprow.dev/view/gs/oss-prow/pr-logs/pull/kubeflow_pipelines/10605/kubeflow-pipeline-upgrade-test/1772473169093857280 I need to authenticate with a google.com account it seems.

@abaland
Copy link
Contributor Author

abaland commented Mar 26, 2024

Mhhh, one of the errors (samples-v2) is

  File "sample_test.py", line 21, in <module>
    import kfp.deprecated as kfp
  File "/usr/local/lib/python3.7/site-packages/kfp/deprecated/__init__.py", line 21, in <module>
    from ._client import Client
  File "/usr/local/lib/python3.7/site-packages/kfp/deprecated/_client.py", line 97, in <module>
    class Client(object):
  File "/usr/local/lib/python3.7/site-packages/kfp/deprecated/_client.py", line 396, in Client
    def get_kfp_healthz(self) -> kfp_server_api.ApiGetHealthzResponse:
AttributeError: module 'kfp_server_api' has no attribute 'ApiGetHealthzResponse'

One (backend-test) is

ok  	github.com/kubeflow/pipelines/backend/src/v2/objectstore	0.056s	coverage: 14.0% of statements
FAIL	github.com/kubeflow/pipelines/backend/test/initialization [build failed]
FAIL	github.com/kubeflow/pipelines/backend/test/integration [build failed]
FAIL	github.com/kubeflow/pipelines/backend/test/v2/initialization [build failed]
FAIL	github.com/kubeflow/pipelines/backend/test/v2/integration [build failed]

which i could reproduce locally by just executing the test folder and got

# github.com/kubeflow/pipelines/backend/src/common/client/api_server/v1
../../src/common/client/api_server/v1/experiment_client.go:33:24: undefined: params.CreateExperimentV1Params
../../src/common/client/api_server/v1/experiment_client.go:34:21: undefined: params.GetExperimentV1Params
../../src/common/client/api_server/v1/experiment_client.go:35:22: undefined: params.ListExperimentsV1Params
../../src/common/client/api_server/v1/experiment_client.go:36:25: undefined: params.ListExperimentsV1Params
../../src/common/client/api_server/v1/experiment_client.go:37:25: undefined: params.ArchiveExperimentV1Params
../../src/common/client/api_server/v1/experiment_client.go:38:27: undefined: params.UnarchiveExperimentV1Params
../../src/common/client/api_server/v1/healthz_client.go:31:24: undefined: params.GetHealthzOK
../../src/common/client/api_server/v1/job_client.go:33:24: undefined: params.CreateJobParams
../../src/common/client/api_server/v1/job_client.go:34:21: undefined: params.GetJobParams
../../src/common/client/api_server/v1/job_client.go:35:24: undefined: params.DeleteJobParams
../../src/common/client/api_server/v1/job_client.go:35:24: too many errors
FAIL	github.com/kubeflow/pipelines/backend/test/initialization [build failed]

I don't think this MR touched anything related to the above, which makes me think this is related to the multiple MR from last night's and the KFP api server release that came with it.

@abaland
Copy link
Contributor Author

abaland commented Mar 26, 2024

For the first error, I see that it used to be ApiGetHealthzResponse in

from kfp_server_api.models.api_get_healthz_response import ApiGetHealthzResponse
for the v1beta1, but got switched to V2beta1GetHealthzResponse in https://github.com/kubeflow/pipelines/blob/5399585b6a0f92446bcfc5a7588f2a85ea0fe6a3/backend/api/v2beta1/python_http_client/kfp_server_api/__init__.py#L55C1-L56C1
so i guess i could switch other all kfp_server_api.Api* to kfp_server_api.V2beta1*, but I'll admit i have no idea why this is happening and where/how this changed. So any pointer is more than welcome

@rimolive
Copy link
Member

@abaland With #10602 merged, you're free to remove WIP label in this PR. Let's see with the new changes if tests will keep breaking.

@abaland
Copy link
Contributor Author

abaland commented Mar 26, 2024

@rimolive I rebased the current branch onto master after the previous PR had been merged (and also pulled the most recent version of go kubernetes_platform module) but unfortunately that did not fix the tests.
At least https://github.com/abaland/pipelines/tree/feat/ephemeral_volume2 does not seem to indicate I have commits from master not included.
Did I miss something else to run an update?

@abaland abaland marked this pull request as ready for review March 26, 2024 14:46
@rimolive
Copy link
Member

rimolive commented Apr 3, 2024

/retest

@Tomcli
Copy link
Member

Tomcli commented Apr 3, 2024

@abaland Can you rebase to the latest commit to include the new API client code? Thanks

…d support to backend

Signed-off-by: abaland <abaland@indeed.com>
…d support to backend

Signed-off-by: abaland <abaland@indeed.com>
@abaland
Copy link
Contributor Author

abaland commented Apr 3, 2024

Just did 🤞

Signed-off-by: abaland <abaland@indeed.com>
Signed-off-by: abaland <abaland@indeed.com>
Copy link

google-oss-prow bot commented Apr 4, 2024

@abaland: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
kubeflow-pipelines-samples-v2 01bd04b link false /test kubeflow-pipelines-samples-v2
kfp-kubernetes-execution-tests 01bd04b link false /test kfp-kubernetes-execution-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@abaland
Copy link
Contributor Author

abaland commented Apr 4, 2024

Ok the required tests passed this time :dance:

The samples tests failing: I tried compiling one of the kfp-kubernetes-execution-tests test case locally, uploading it to our KFP environment (running on KF 1.8 release) and running it but it did not cause any error, so i'm not sure what's going on.

the samples-v2 is having the same issues as before and using the wrong api endpoint in kfp_server_api, which seems unrelated to the current PR

@rimolive
Copy link
Member

rimolive commented Apr 4, 2024

Since they are not required tests, I believe we can just ignore
/lgtm
/assign @Tomcli @chensun

Copy link
Member

@Tomcli Tomcli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chensun, Tomcli

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 3fb76a8 into kubeflow:master Apr 9, 2024
12 of 14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants