Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(sdk): fix upload_pipeline when no pipeline name is provided #8695

Merged
merged 1 commit into from
Jan 31, 2023

Conversation

tarat44
Copy link
Contributor

@tarat44 tarat44 commented Jan 18, 2023

Description of your changes:
This PR fixes a bug where pipelines submitted through the client using upload_pipeline that don't have a name passed to the method are not named correctly.

Checklist:

@google-cla
Copy link

google-cla bot commented Jan 18, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@google-oss-prow
Copy link

Hi @tarat44. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@connor-mccarthy
Copy link
Member

/ok-to-test

Copy link
Member

@connor-mccarthy connor-mccarthy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, @tarat44.

Can you fill out the CLA?
You can ignore kubeflow-pipelines-sdk-execution-tests failures -- they are unrelated to your changes.

@@ -1373,7 +1373,7 @@ def upload_pipeline(
"""
if pipeline_name is None:
pipeline_name = os.path.splitext(
os.path.basename('something/file.txt'))[0]
os.path.basename(pipeline_package_path))[0]
Copy link
Member

@connor-mccarthy connor-mccarthy Jan 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find!

It looks like validate_pipeline_resource_name raises on underscores. This would be a problem for a path like pipelines/two_step_pipeline.yaml, for example. Note that if you remove the validation function it uploads just fine, but you cannot create a run with the pipeline (BE throws a validation error). (I think perhaps the BE should throw a 400 if the user tries to upload a pipeline for which they could not later create a run due to an invalid name. cc: @Linchin, @gkcalat, @chensun)

While it's not the intent of the existing (errored) logic, I think a possibly better option than using the file name is to use the name already assigned to the pipeline in IR (example).

This approach ensures that if an author creates a pipeline with @dsl.pipeline(name='training-pipeline'), for example, the name will be consistent in the UI. In a sense, this makes the pipeline_name parameter of Client.upload_pipeline an "override" parameter.

It appears that this is (sort of) what happens anyway on the BE. If you submit a pipeline with pipeline_name='overridden-pipeline-name', it will assign it to pipelineInfo.name in the IR YAML (with an added pipeline/ prefix):

image

This is confusing. In short:

  • I think there is some inconsistent behavior between the KFP SDK and KFP BE.
  • When pipeline_name is None, I think we should consider using the name already assigned to the pipeline in IR YAML instead of the file name.

WDYT?

Copy link
Contributor Author

@tarat44 tarat44 Jan 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this idea makes a lot of sense because if the error is thrown and the user does not pass in the name specifically, they might be a bit confused about where the name came from and how to fix it. I would be happy to implement the new approach if it sounds like a good approach to everyone.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me 👍🏼

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@connor-mccarthy, I looked into this implementation further, and I think the backend is enforcing setting the name of the pipeline to the package name when there is no name provided https://github.com/kubeflow/pipelines/blob/master/backend/src/apiserver/server/pipeline_upload_server.go#L113 https://github.com/kubeflow/pipelines/blob/master/backend/src/apiserver/server/util.go#L63 and the IR spec is parsed later on https://github.com/kubeflow/pipelines/blob/master/backend/src/apiserver/resource/resource_manager.go#L264. It seems like the best long-term solution would be to have the upload pipeline backend use the IR spec name if the uploaded file is an IR spec and no name query param is provided. However, since the api v2 is still in active progress, would it make sense for me to have the sdk parse the yaml spec and send the name to the api request call for now? The api changes seem pretty straightforward and I'm happy to do them, but wanted to ask first in case this implementation would add complications to larger api changes in progress.

Copy link
Member

@connor-mccarthy connor-mccarthy Jan 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed on all of that. Thanks for being mindful of the BE development. The KFP SDK implementation is a good short term option and is preferred over the current SDK implementation, so sounds good if you'd like to proceed with that.

However, I agree that in the long run this should live on the BE, otherwise SDK requests contain duplicate information (i.e., creates coupling) that need to be maintained.

The api changes seem pretty straightforward and I'm happy to do them, but wanted to ask first in case this implementation would add complications to larger api changes in progress.

If it is easy and non-disruptive to add these changes to the BE, let's do that. If it's disruptive, let's start with the SDK changes.

@gkcalat, would it be disruptive for @tarat44 to add the following logic to the BE's upload pipeline handling?

BE accepts a null value in the request's name field and, when this happens, the BE will extract and use the pipeline name in the PipelineSpec message as the name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@connor-mccarthy @gkcalat, I went ahead with making the changes to the sdk for now but happy to pivot to the backend if desired.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tarat44 and @connor-mccarthy! Thank you for your contribution!
SGTM on implementing this feature in the backend.

The name of an uploaded pipeline is set here. Feel free to propose a change. Please, make sure to adjust the unit tests if we have any that check the old naming logic. Once ready, please ping us.

Note, as we are actively working on making 2.0.0-beta.0 release, your PR will probably be merged after the release. And you may need to rebase your change if you create it before the release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thank you for the guidance

Copy link
Member

@connor-mccarthy connor-mccarthy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, @tarat44. Thank you!

/lgtm
/approve

@connor-mccarthy
Copy link
Member

connor-mccarthy commented Jan 30, 2023

One small thing: could you add a release note to RELEASE.md so that the community knows about this contribution?

This should probably live under "Bug fixes and other changes".

@connor-mccarthy
Copy link
Member

/remove-approve

@tarat44
Copy link
Contributor Author

tarat44 commented Jan 31, 2023

One small thing: could you add a release note to RELEASE.md so that the community knows about this contribution?

This should probably live under "Bug fixes and other changes".

@connor-mccarthy , thank you for the review. I added the release note.

Copy link
Member

@connor-mccarthy connor-mccarthy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @tarat44! Could you add the release note in the same style as the other entries, with a link to the PR?

@google-oss-prow
Copy link

@tarat44: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
kubeflow-pipelines-sdk-execution-tests 2d66b52 link true /test kubeflow-pipelines-sdk-execution-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@tarat44
Copy link
Contributor Author

tarat44 commented Jan 31, 2023

Thanks, @tarat44! Could you add the release note in the same style as the other entries, with a link to the PR?

@connor-mccarthy , updated the entry. Please re-review when you have time

@connor-mccarthy
Copy link
Member

Thank you again, @tarat44!

/lgtm
/approve

@google-oss-prow google-oss-prow bot added the lgtm label Jan 31, 2023
@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: connor-mccarthy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@connor-mccarthy
Copy link
Member

Manually merging. kubeflow-pipelines-sdk-execution-tests failures are unrelated to these changes.

@connor-mccarthy connor-mccarthy merged commit e50f403 into kubeflow:master Jan 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants