[Bug]: Error when ParDo returns None for all elements in DirectRunner using more than 1 worker #23228
Comments
@Wal8800 What was the root cause for the issue? Getting the same error now under 2.41.0 and 2.42.0. |
Is there any temporary workaround until 2.44.0 gets released? |
@kdcyberdude Not a great solution, but I set the |
not solved |
This seems like a pretty big bug, given that working around it means running your library on something far more complex like Flink or Dataflow. |
Was there a resolution to this bug? I am having the same issue in 2.46 |
Switching to single-threaded/single-worker mode fixed it for me |
Thanks! Single-threaded/single-worker mode works for me as well. However, I would like to run my Beam pipeline on multiple cores using the DirectRunner. |
This should be solved in version 2.50.0. Thanks |
What happened?
When a custom DoFn can return None depending on the element, and the DirectRunner is running with more than 1 worker, the pipeline triggers the following error when every element of the input returns None from the DoFn.
Error:
Example pipeline script:
Example input.txt (lines of empty strings):
Example command:
Ran the example in Apache Beam 2.41.0.
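For readers without the attachments, here is a minimal sketch of the scenario described above, together with the single-worker workaround mentioned in the comments. It is not the original example script: `non_empty_or_none`, `DropBlank`, and `run` are hypothetical names, and `beam.Create` stands in for reading input.txt. It assumes `apache_beam` is installed and uses the DirectRunner options `--direct_num_workers` and `--direct_running_mode`.

```python
def non_empty_or_none(line):
    """Return [line] for non-blank input, otherwise None (no output)."""
    stripped = line.strip()
    if not stripped:
        return None
    return [stripped]


def run(lines, num_workers=1, running_mode="in_memory"):
    """Run a small DirectRunner pipeline over `lines` (hypothetical helper)."""
    # Imported lazily so the helper above can be checked without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    class DropBlank(beam.DoFn):
        # Returns None for blank elements, as in the scenario in this issue.
        def process(self, element):
            return non_empty_or_none(element)

    options = PipelineOptions([
        "--runner=DirectRunner",
        f"--direct_num_workers={num_workers}",
        f"--direct_running_mode={running_mode}",
    ])
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Create" >> beam.Create(lines)
            | "DropBlank" >> beam.ParDo(DropBlank())
            | "Print" >> beam.Map(print)
        )
```

Calling `run(["", "", ""])` uses the single-worker configuration that commenters reported as working. Passing `num_workers=2, running_mode="multi_threading"` corresponds to the multi-worker setup this issue is about; whether it fails will depend on the Beam version.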
Issue Priority
Priority: 2
Issue Component
Component: runner-direct