fix(sdk): avoid conflicting component names in DAG when reusing pipelines #11071
Signed-off-by: Stijn Tratsaert <stijn.tratsaert.it@gmail.com>
/ok-to-test
delimiter='-')
old_name_to_new_name[old_component_name] = new_component_name

ordered_names = enumerate(old_name_to_new_name.items())
lifo_ordered_names = sorted(ordered_names, key=lambda x: x[0], reverse=True)
LIFO ordering for renaming might not appropriately handle the component references, especially if the pipeline structure doesn't align with this approach.
Does the pipeline structure have to align with the renaming approach? LIFO seems crucial here as you want the most complex names (the last in order) to be renamed first to avoid renaming/conflicting with the more generic names.
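The ordering argument above can be illustrated with a minimal sketch. It is not the SDK's code: `rename_refs` is a hypothetical helper, and the component names are made up. It assumes renames are applied one at a time over a flat list of references, and that one component's new name equals another component's old name.

```python
def rename_refs(refs, ordered_renames):
    """Apply (old, new) renames one at a time, in the given order.

    Hypothetical helper for illustration only, not part of the kfp SDK.
    """
    refs = list(refs)
    for old, new in ordered_renames:
        refs = [new if ref == old else ref for ref in refs]
    return refs


# Insertion-ordered mapping; the new name of the first entry is the
# old name of the second entry, which is what makes ordering matter.
old_name_to_new_name = {
    'comp-a': 'comp-a-2',
    'comp-a-2': 'comp-a-2-2',
}
refs = ['comp-a', 'comp-a-2']

# First-in-first-out: 'comp-a' becomes 'comp-a-2', which the second
# rename then rewrites again, so both references collapse to one name.
fifo_result = rename_refs(refs, old_name_to_new_name.items())

# Last-in-first-out, built the same way as in the diff above.
ordered_names = enumerate(old_name_to_new_name.items())
lifo_ordered_names = sorted(ordered_names, key=lambda x: x[0], reverse=True)
lifo_result = rename_refs(refs, [pair for _, pair in lifo_ordered_names])

print(fifo_result)  # ['comp-a-2-2', 'comp-a-2-2']
print(lifo_result)  # ['comp-a-2', 'comp-a-2-2']
```

Under these assumptions, reversing the order keeps the two references distinct because the later, more specific name is moved out of the way before the earlier name is renamed into its place.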
old_name_to_new_name = {}
for component_name, component_spec in sub_pipeline_spec.components.items():
    existing_main_comp_names = list(main_pipeline_spec.components.keys())
SDK execution tests are failing: https://github.com/kubeflow/pipelines/actions/runs/10228605366/job/28369638777?pr=11071
https://github.com/kubeflow/pipelines/blob/master/test/sdk-execution-tests/sdk_execution_tests.py#L114
Could you please take a look?
Last I heard about this test, this was said. Let me know what you think.
Upon inspecting the logs in more depth, it seems logical that this test would fail now that the renaming logic has been updated. Do you think it is appropriate to update the expected .yaml result with the new configuration?
  new_component_name = utils.make_name_unique_by_adding_index(
      name=component_name,
-     collection=list(main_pipeline_spec.components.keys()),
+     collection=existing_main_comp_names + current_comp_name_collection,
make_name_unique_by_adding_index may not ensure complete uniqueness of component names when used within nested or reused pipelines. This could result in naming conflicts if the names are not correctly indexed.
I would expect it does when passing every component to the collection instead of just the components from the main pipeline. Could you elaborate on other cases I'm missing out on?
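To make the point about the collection concrete, here is a minimal sketch. `make_unique` is a hypothetical stand-in written for this example, not the SDK's `make_name_unique_by_adding_index`, and `assign_names` and the component names are likewise assumptions. It compares checking new names against the main pipeline only with checking against the main pipeline plus the names already assigned during the merge.

```python
def make_unique(name, collection, delimiter='-'):
    """Hypothetical stand-in: append the first free index, starting at 2."""
    if name not in collection:
        return name
    index = 2
    while f'{name}{delimiter}{index}' in collection:
        index += 1
    return f'{name}{delimiter}{index}'


def assign_names(main_names, sub_names, track_assigned):
    """Map each sub-pipeline component name to a name for the merged spec."""
    mapping = {}
    for name in sub_names:
        taken = list(main_names)
        if track_assigned:
            # Also avoid names handed out earlier in this same merge.
            taken += list(mapping.values())
        mapping[name] = make_unique(name, taken)
    return mapping


main_names = ['comp-a']
sub_names = ['comp-a', 'comp-a-2']

naive = assign_names(main_names, sub_names, track_assigned=False)
fixed = assign_names(main_names, sub_names, track_assigned=True)

print(naive)  # {'comp-a': 'comp-a-2', 'comp-a-2': 'comp-a-2'}
print(fixed)  # {'comp-a': 'comp-a-2', 'comp-a-2': 'comp-a-2-2'}
```

In this sketch the main-only collection lets two sub-pipeline components land on `comp-a-2`, while the combined collection keeps them apart. Whether other cases remain uncovered, as the reviewer suggests, is not something this example settles.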
No updates?
Description of your changes:
This is an up-to-date copy of PR #9969 that properly follows the contributor's guide.
This pull request addresses the issue of ensuring unique component names when merging component specifications from a sub-pipeline into a main pipeline configuration. The changes ensure that each component in the merged pipeline has a unique name, thus preventing conflicts and collisions that can occur when components from sub-pipelines are integrated into the main pipeline.
Checklist: