Add a pipeline update compatibility version option. #29140

Merged: 7 commits into apache:master on Dec 7, 2023

Conversation

robertwb
Contributor

This can be used to migrate to best practices and good defaults with new versions of Beam while still allowing users of older SDKs to update their SDK version without breaking update compatibility.

Also add the mechanisms to propagate this option for cross-language transforms.
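
For illustration only, here is a minimal sketch of how a transform might use such an option to choose between an old and a new default. The names `compat_version`, `NEW_DEFAULT_SINCE`, and the two return values are hypothetical and are not taken from this PR; the parser handles only plain `MAJOR.MINOR.PATCH` strings.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


# Hypothetical: the first SDK version in which the new default applies.
NEW_DEFAULT_SINCE = (2, 53, 0)


def choose_encoding(compat_version: Optional[str]) -> str:
    """Pick a (made-up) encoding based on the requested compatibility version.

    If no compatibility version is requested, the new default is used.
    If the user asks for compatibility with an SDK older than the one that
    introduced the new default, the legacy behavior is kept.
    """
    if compat_version is None:
        return "new_encoding"
    if parse_version(compat_version) < NEW_DEFAULT_SINCE:
        return "legacy_encoding"
    return "new_encoding"
```

With this shape, a user updating a running pipeline would pass the version they originally launched with and keep the legacy behavior, while new pipelines pick up the improved default.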


@robertwb robertwb marked this pull request as ready for review October 25, 2023 19:54
@robertwb
Contributor Author

R: @kennknowles

@github-actions
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

Member

@kennknowles kennknowles left a comment

+1000000 for this concept. Worth a dev@ thread to gather ideas and advertise this new thing, since it'll change how we develop in many cases.


// (Optional) A set of Pipeline Options that should be used
// when expanding this transform.
google.protobuf.Struct pipeline_options = 5;
Member

@chamikaramj just checking this with you

Contributor

+1. We've been thinking about propagating other PipelineOptions to expansion as well (which we can piggy-back on top of this feature, on a case-by-case basis).


@Description(
"If set, attempts to produce a pipeline compatible with this prior version of the Beam SDK."
+ " See https://cloud.google.com/dataflow/docs/guides/updating-a-pipeline.")
Member

I think we have actually had some discussions on dev@, and this also applies to Flink and Samza. Makes sense, basically. It isn't rigorously defined, but we do OK with it. And I think it is more or less runner independent.

Contributor Author

Agreed. Started a discussion.

request.pipeline_options)
# TODO(https://github.com/apache/beam/issues/20090): Figure out the
# correct subset of options to apply to expansion.
if request_options.view_as(
Member

same comment re: being not just for GCP

Contributor Author

I put it here as it's next to the other update options. Open to suggestions if there's a better place. (A new update options?)

Contributor Author

Fixed.

@je-ik
Contributor

je-ik commented Oct 27, 2023

+1, this is great. Should we add this to the non-portable PipelineOptions as well?

@je-ik
Contributor

je-ik commented Oct 27, 2023

And one more technical question: should the version simply be a string (which would be compared lexicographically), or should we parse it so that each use of the version does not re-interpret it? For example, there could be issues if and when we reach Beam 2.100.0.

@robertwb
Contributor Author

I intend the version to be compared per https://semver.org/; I'll update the docs.
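
To make the 2.100.0 concern concrete, here is a small sketch contrasting plain string comparison with numeric, component-wise comparison. The helper `version_key` is a hypothetical name for illustration, not Beam's implementation, and it handles only plain `MAJOR.MINOR.PATCH` strings (no pre-release or build metadata, which full semver precedence rules also cover).

```python
def version_key(version: str) -> tuple:
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


# Plain string comparison looks at characters, so "1" < "9" decides the result
# and 2.100.0 is wrongly ordered before 2.99.0.
lexicographic_says_older = "2.100.0" < "2.99.0"

# Component-wise integer comparison orders 2.100.0 after 2.99.0, as intended.
numeric_says_older = version_key("2.100.0") < version_key("2.99.0")
```

Here `lexicographic_says_older` is `True` while `numeric_says_older` is `False`, which is why parsing once into a comparable form is safer than comparing raw strings at each call site.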

As for placement, maybe we should put it in StreamingOptions (https://beam.apache.org/releases/javadoc/current/index.html?org/apache/beam/sdk/options/StreamingOptions.html). I think I'll do that.

@robertwb
Contributor Author

I've resolved the merge conflicts. Please take another look.

@robertwb
Contributor Author

robertwb commented Dec 4, 2023

For the record, dev discussion at https://lists.apache.org/thread/29r3zv04n4ooq68zzvpw6zm1185n59m2

In summary, we could do better if we add the capability to inspect the graph at construction time, but that doesn't preclude this step forward for now. It doesn't seem there's any objection to getting this in.

@robertwb robertwb merged commit 3182273 into apache:master Dec 7, 2023
98 of 99 checks passed
JayajP pushed a commit to JayajP/beam that referenced this pull request Dec 27, 2023
Naireen pushed a commit to Naireen/beam that referenced this pull request Jan 3, 2024