-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow override beam version for PythonExternalTransform via pipeline option #31691
Conversation
R: @chamikaramj |
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control |
You can override this whole setup by starting up an expansion service manually and specifying it via the Line 98 in 2f51718
Also, note that for this to work end-to-end, the |
Thanks for pointing that. In the case of release candidate validation, e.g., GoogleCloudPlatform/DataflowTemplates#1681 (comment) it would be great to require minimum change (like flag flip) to be able to run the same workflow using release candidate vs. released version. Manual expansion service in theory works, but it then no longer validate the components of auto spin up a expansion service during pipeline expansion, which is how external transform works by default with released beam.
We do publish 2.xx.0RC1 container to dockerhub so it should be valid. |
Agree that it's better if we can use the auto-started expansion service for validation. That would be the closest to what most customers will use. So LGTM :) |
(please run a test to make sure that this works end-to-end for RCs) |
Tested. Steps: Patched v2.57.0 with this PR:
Then created a standalone project, copy-pasted PythonExternalTransformTest.trivialPythonTransform Direct runner (pipeline expansion succeeded, pipeline execution has run time error, expected):
Dataflow runner (pipeline option "--customBeamRequirement=2.56.0") (2.57.0RC1 is already deleted from PyPI)
Dataflow worker log
and pipeline run succeeded. |
Fix #31680
This was exposed in bootstrap_beam_venv.py, incuding the ability to send a tarball:
beam/sdks/java/extensions/python/src/main/resources/org/apache/beam/sdk/extensions/python/bootstrap_beam_venv.py
Lines 70 to 71 in 90d3f8a
However, it is not exposed in PythonExternalTransform that uses it.
Tested by manually adding a test (after adding direct-java dependency in test runtime classpath)
then there is log
however it is hard to unit testing as it either requires to communicate to the real pypi service, or have a beam sdk python tarball in place.
Please add a meaningful description for your change here
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123
), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>
instead.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.