v2 compiler has data passing limitations #5711
Comments
This won't work. The v2 compiler imposes severe limitations on the ... A fix for this v2 compiler issue was proposed and implemented in #5478. Please take a look.
I'll take a look, cheers! For reference, my version that uses a local component with this change does compile, and works as expected. I imagine you'd run into trouble if you wanted to upload an artifact rather than a string, but as most of my pipelines are being ported from v1 they don't use artifact types (at least not intentionally: a lot of previously untyped parameters started being handled that way).
With the current state of the "v2 compiler", you'll need two separate uploader components: one for uploading strings (String-typed outputs) and another for uploading non-strings...
KFP v1
KFP has a simple and easy data passing model that users like. There are different kinds of arguments and different ways to consume an input argument, and all combinations work. Any input can be consumed
And the argument for any input can come from
Combining these options there are 6 end-to-end data passing cases, each of which works regardless of types:
The only way types come into play is that when both the upstream output and the downstream input have types, the types must match.
Current KFP v2 compiler limitations:
The current version of the v2 compiler adds numerous data-passing limitations. Most limitations can easily be lifted (see #5478), but the compiler team wants to hear more user feedback.
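The v1 type rule described above (types matter only when both ends declare one) can be sketched in a few lines. This is a simplified illustration, not the SDK's actual check: the function name `v1_types_compatible` is hypothetical, and types are modelled as plain strings.

```python
from typing import Optional


def v1_types_compatible(output_type: Optional[str],
                        input_type: Optional[str]) -> bool:
    """Hypothetical sketch of the v1 rule: compare types only if both exist."""
    # An untyped output or an untyped input places no constraint on the link.
    if output_type is None or input_type is None:
        return True
    # Both sides are typed, so the declared types must match.
    return output_type == input_type


# Untyped downstream input accepts a typed upstream output.
print(v1_types_compatible("String", None))
# Two typed ends with different types are rejected.
print(v1_types_compatible("String", "Integer"))
```

Under this rule an untyped input never blocks a connection, which is why untyped v1 components could be wired together freely.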
KFP v2 has a different data model from KFP v1. In KFP v2, we distinguish between parameters and artifacts, and they behave differently. For instance, parameters are stored in MLMD and can be queried, while for artifacts we don't store the file content in MLMD, but artifacts can carry additional metadata. I consider this a feature improvement over the KFP v1 data model. This doc describes which types are treated as parameters and which are treated as artifacts. We have a simple rule for deciding between parameters and artifacts. The decision should be deterministic and should not depend on how a component is used. The proposed fix (#5478) you mentioned above would infer the user's intention in specific scenarios. As we've discussed offline, it could lead to inconsistent behavior, and the team decided not to move forward with it. I believe most of the related issues users may have can be fixed simply by adding or updating types. We also need to update our samples and components to make them compatible (tracked by #5801).
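As a rough illustration of a deterministic, type-only rule like the one described above, the sketch below classifies an input from its declared type name alone. The set of parameter type names is an assumption made for this example (the doc referenced in the comment is the authoritative list), and `classify_input` is a hypothetical helper, not an SDK function.

```python
from typing import Optional

# Assumption for illustration: primitive type names treated as parameters.
# The authoritative list lives in the doc referenced in the comment above.
ASSUMED_PARAMETER_TYPES = {"string", "str", "integer", "int", "float", "double"}


def classify_input(type_name: Optional[str]) -> str:
    """Hypothetical helper: decide parameter vs. artifact from the type only."""
    # The decision looks only at the declared type, never at how the
    # component is wired into a pipeline, so it is deterministic.
    if type_name is not None and type_name.lower() in ASSUMED_PARAMETER_TYPES:
        return "parameter"
    # Untyped inputs and every other type name fall through to artifact.
    return "artifact"


# An untyped input (like the Data input in this issue) becomes an artifact.
print(classify_input(None))
# Declaring the String type makes the same input a parameter.
print(classify_input("String"))
```

This is why adding a type to an untyped input changes how the v2 compiler treats it, without any inference about the user's intent.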
Hi @jackwhelpton, as we spoke last time on this issue, I believe your use case is unblocked by adding/updating the type. If there are no other blockers, do you think we can close this issue? Thanks!
It seems like everything I'd wanted from this ticket is covered by #5801, so I'm fine with closing this one. Thanks!
Thanks Jack. Closing this issue.
/reopen let's keep this open
@Bobgy: Reopened this issue.
#6080 was meant to be temporary in the first place (I propose to revert it by the KFP SDK 2.0 release).
Environment
kfp 1.6.2
kfp-pipeline-spec 0.1.7
kfp-server-api 1.5.0
Steps to reproduce
Use the upload_to_explicit_url component in a pipeline (picking a version that includes the latest v2 updates):
As the Data input is untyped, it is treated as an artifact, and compiling produces this error:
The fix appears straightforward: include the String type on the Data input as well as the GCS path input and output.
Expected result
Pipeline compiles successfully.
Materials and Reference
https://raw.githubusercontent.com/kubeflow/pipelines/961b17fa6844e1d79e5d3686bb557d830d7b5a95/components/google-cloud/storage/upload_to_explicit_uri/component.yaml
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.