-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(sdk): fix upload_pipeline when no pipeline name is provided #8695
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
Hi @tarat44. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, @tarat44.
Can you fill out the CLA?
You can ignore kubeflow-pipelines-sdk-execution-tests
failures -- they are unrelated to your changes.
sdk/python/kfp/client/client.py
Outdated
@@ -1373,7 +1373,7 @@ def upload_pipeline( | |||
""" | |||
if pipeline_name is None: | |||
pipeline_name = os.path.splitext( | |||
os.path.basename('something/file.txt'))[0] | |||
os.path.basename(pipeline_package_path))[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good find!
It looks like validate_pipeline_resource_name
raises on underscores. This would be a problem for a path like pipelines/two_step_pipeline.yaml
, for example. Note that if you remove the validation function it uploads just fine, but you cannot create a run with the pipeline (BE throws a validation error). (I think perhaps the BE should throw a 400 if the user tries to upload a pipeline for which they could not later create a run due to an invalid name. cc: @Linchin, @gkcalat, @chensun)
While it's not the intent of the existing (errored) logic, I think a possibly better option than using the file name is to use the name already assigned to the pipeline in IR (example).
This approach ensures that if an author creates a pipeline with @dsl.pipeline(name='training-pipeline')
, for example, the name will be consistent in the UI. In a sense, this makes the pipeline_name
parameter of Client.upload_pipeline
an "override" parameter.
It appears that this is (sort of) what happens anyway on the BE. If you submit a pipeline with pipeline_name='overridden-pipeline-name'
, it will assign it to pipelineInfo.name
in the IR YAML (with an added pipeline/
prefix):
This is confusing. In short:
- I think there is some inconsistent behavior between the KFP SDK and KFP BE.
- When
pipeline_name is None
, I think we should consider using the name already assigned to the pipeline in IR YAML instead of the file name.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this idea makes a lot of sense because if the error is thrown and the user does not pass in the name specifically, they might be a bit confused about where the name came from and how to fix it. I would be happy to implement the new approach if it sounds like a good approach to everyone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me 👍🏼
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@connor-mccarthy, I looked into this implementation further, and I think the backend is enforcing setting the name of the pipeline to the package name when there is no name provided https://github.com/kubeflow/pipelines/blob/master/backend/src/apiserver/server/pipeline_upload_server.go#L113 https://github.com/kubeflow/pipelines/blob/master/backend/src/apiserver/server/util.go#L63 and the IR spec is parsed later on https://github.com/kubeflow/pipelines/blob/master/backend/src/apiserver/resource/resource_manager.go#L264. It seems like the best long-term solution would be to have the upload pipeline backend use the IR spec name if the uploaded file is an IR spec and no name query param is provided. However, since the api v2 is still in active progress, would it make sense for me to have the sdk parse the yaml spec and send the name to the api request call for now? The api changes seem pretty straightforward and I'm happy to do them, but wanted to ask first in case this implementation would add complications to larger api changes in progress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on all of that. Thanks for being mindful of the BE development. The KFP SDK implementation is a good short term option and is preferred over the current SDK implementation, so sounds good if you'd like to proceed with that.
However, I agree that in the long run this should live on the BE, otherwise SDK requests contain duplicate information (i.e., creates coupling) that need to be maintained.
The api changes seem pretty straightforward and I'm happy to do them, but wanted to ask first in case this implementation would add complications to larger api changes in progress.
If it is easy and non-disruptive to add these changes to the BE, let's do that. If it's disruptive, let's start with the SDK changes.
@gkcalat, would it be disruptive for @tarat44 to add the following logic to the BE's upload pipeline handling?
BE accepts a null value in the request's name
field and, when this happens, the BE will extract and use the pipeline name in the PipelineSpec message as the name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@connor-mccarthy @gkcalat, I went ahead with making the changes to the sdk for now but happy to pivot to the backend if desired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tarat44 and @connor-mccarthy! Thank you for your contribution!
SGTM on implementing this feature in the backend.
The name of an uploaded pipeline is set here. Feel free to propose a change. Please, make sure to adjust the unit tests if we have any that check the old naming logic. Once ready, please ping us.
Note, as we are actively working on making 2.0.0-beta.0 release, your PR will probably be merged after the release. And you may need to rebase your change if you create it before the release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, thank you for the guidance
9d5ddfa
to
ccd6c5f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, @tarat44. Thank you!
/lgtm
/approve
One small thing: could you add a release note to This should probably live under "Bug fixes and other changes". |
/remove-approve |
ccd6c5f
to
2d66b52
Compare
@connor-mccarthy , thank you for the review. I added the release note. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @tarat44! Could you add the release note in the same style as the other entries, with a link to the PR?
@tarat44: The following test failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
2d66b52
to
6b61584
Compare
@connor-mccarthy , updated the entry. Please re-review when you have time |
Thank you again, @tarat44! /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: connor-mccarthy The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Manually merging. |
Description of your changes:
This PR fixes a bug where pipelines submitted through the client using
upload_pipeline
that don't have a name passed to the method are not named correctly.Checklist: