Fix lost signal from Selector when Default path blocks #1682
internal/internal_workflow.go
readyBranch = func() {
	// readyBranch is not executed when AddDefault is specified,
	// setting the value here prevents the signal from being dropped
	dropSignalFlag := getWorkflowEnvironment(ctx).GetFlag(SDKFlagBlockedSelectorSignalReceive)
What test did you add that tests the true path here?
TestSelectBlockingDefaultWithFlag tests the true scenario.
Ah, I see: the test environment always assumes the flag is true. Hm, I am not sure we should assume that, since the test environment would then diverge from the real workflow environment. What do you think?
I agree that the test environment shouldn't assume flags are true by default. That seems like a change that belongs in its own PR; I can create an issue for this.
To be clear, I don't think SDK flags should be off by default in the test environment. I meant that the SDK flags used in the test environment should match what we enable when running a new workflow.
Oops, I misunderstood. I agree.
Is there anywhere flags are set by default for a workflow? From looking at the code, it seems like TryUse calls are scattered around the code for different scenarios.
> Is there anywhere flags are set by default for a workflow? From looking at the code, it seems like TryUse calls are scattered around the code for different scenarios.
Yeah, the way to think of it is that for new workflows all current flags are set to true. However, we don't want the new flag you are adding here to be set to true by default, because that would make it difficult for users to roll back their SDK version.
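To make the flag discussion concrete, here is a minimal sketch of that policy. It is a toy model, not the SDK's implementation: `sdkFlag`, `flagSet`, `enabledForNew`, and `tryUse` are hypothetical names chosen for illustration. The assumption is that a flag counts during replay only if the history already recorded it, and that a new workflow may turn on only flags that are enabled by default. Presumably the rollback concern is that an older SDK version could not replay a history that records a flag it does not know about.

```go
package main

import "fmt"

// sdkFlag identifies one gated behavior change (hypothetical type).
type sdkFlag uint32

const (
	// flagExisting stands in for a flag already enabled for new workflows.
	flagExisting sdkFlag = iota + 1
	// flagBlockedSelectorSignalReceive stands in for the flag added in this PR.
	flagBlockedSelectorSignalReceive
)

// flagSet models the flags of a single workflow execution.
type flagSet struct {
	recorded      map[sdkFlag]bool // flags already written to this workflow's history
	enabledForNew map[sdkFlag]bool // flags a new workflow is allowed to turn on
}

// newFlagSet builds a flagSet whose history already contains the given flags.
// Only flagExisting is enabled by default; the new flag is deliberately left out.
func newFlagSet(recorded ...sdkFlag) *flagSet {
	f := &flagSet{
		recorded:      map[sdkFlag]bool{},
		enabledForNew: map[sdkFlag]bool{flagExisting: true},
	}
	for _, flag := range recorded {
		f.recorded[flag] = true
	}
	return f
}

// tryUse reports whether the gated behavior should be used.
// A recorded flag is always honored. An unrecorded flag is turned on (and
// recorded) only when the workflow is not replaying and the flag is enabled
// by default for new workflows.
func (f *flagSet) tryUse(flag sdkFlag, replaying bool) bool {
	if f.recorded[flag] {
		return true
	}
	if replaying || !f.enabledForNew[flag] {
		return false
	}
	f.recorded[flag] = true
	return true
}

func main() {
	fresh := newFlagSet()
	fmt.Println(fresh.tryUse(flagExisting, false))                     // true
	fmt.Println(fresh.tryUse(flagBlockedSelectorSignalReceive, false)) // false

	legacy := newFlagSet() // history recorded before any flag existed
	fmt.Println(legacy.tryUse(flagExisting, true)) // false
}
```

Under this model, enabling the new flag later only requires adding it to `enabledForNew`, which matches the plan of shipping the flag first and enabling it in a separate PR.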
	require.EqualValues(t, expected, history)
}

func TestSelectBlockingDefaultWithFlag(t *testing.T) {
The reported bug was that blocking in the default case of a selector could cause signals to be lost. When I last looked at these tests, we didn't seem to have any coverage for blocking in one selector case while a signal is received. Can we add tests to verify there are no bugs if a signal is received while blocking in another case of a selector, not just default?
Do you think it would be feasible to add some debug API to set the flag to true, plus an integration test and a replay test that exercise a real workflow with the flag set?
test/workflow_test.go
	ch2 := workflow.NewChannel(ctx)

	if enableFlag {
		internal.SetUnblockSelectorSignal()
I think we should do this as part of the test, not the workflow. More importantly, though, can you confirm that setting this won't affect any test that runs after this one?
> we should do this as part of the test, not the workflow
What is the difference between the two? Or is this more of a style preference?
> can you confirm setting this won't affect any test that runs after this test?
Looks like it does affect subsequent tests :( Is it enough to unset this value? Go tests run in parallel, so that seems insufficient?
> What is the difference between the two? Or is this more of a style preference?
I think it is clearer, when we are testing a workflow, if all the setup is run beforehand.
> Go tests run in parallel, so that seems insufficient?
Only if t.Parallel() is called.
test/integration_test.go
	defer cancel()
	options := ts.startWorkflowOptions("test-selector-block")

	internal.SetUnblockSelectorSignal()
nit: could just have one function that takes a bool: SetUnblockSelectorSignal(bool)
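A rough sketch of how the single-setter suggestion could combine with the earlier concern about leaking state into later tests. The variable `unblockSelectorSignal`, the body of `SetUnblockSelectorSignal`, and the helper `withUnblockSelectorSignal` are assumptions made for illustration; the actual debug API in the PR may be implemented differently.

```go
package main

import "fmt"

// unblockSelectorSignal is a hypothetical package-level toggle standing in
// for whatever state the debug API controls.
var unblockSelectorSignal bool

// SetUnblockSelectorSignal is a stand-in for the single setter suggested in
// review: one function taking a bool instead of separate set/unset functions.
func SetUnblockSelectorSignal(enabled bool) {
	unblockSelectorSignal = enabled
}

// withUnblockSelectorSignal (hypothetical helper) sets the toggle and returns
// a function that restores the previous value, so a test cannot leak the
// setting into tests that run after it.
func withUnblockSelectorSignal(enabled bool) (restore func()) {
	prev := unblockSelectorSignal
	SetUnblockSelectorSignal(enabled)
	return func() { SetUnblockSelectorSignal(prev) }
}

func main() {
	// Simulates the body of one test: enable the toggle, restore on exit.
	func() {
		restore := withUnblockSelectorSignal(true)
		defer restore()
		fmt.Println("inside test:", unblockSelectorSignal) // inside test: true
	}()
	fmt.Println("after test:", unblockSelectorSignal) // after test: false
}
```

In a real test the restore function could be registered with `t.Cleanup(withUnblockSelectorSignal(true))`. Because the toggle is shared package state, this is only safe while the tests that touch it do not call `t.Parallel()`.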
LGTM! Thanks for putting up with all my requests for more tests.
* initial changes, added replay test for legacy history, need to finish writing tests
* Clean up tests, fix error
* unit test for fixed behavior
* PR feedback
* improve tests, add tests for AddFuture, AddSend
* add integration tests, add debug API to enable SDK flag for tests
* set flag in test itself not workflow, unset flag after test
* unify set/unset function into one
What was changed
Fixed a lost signal, gated behind a new SdkFlag. Currently the flag is added, but never set anywhere, so existing behavior is unchanged. A separate PR will be added to enable the flag.
Why?
We shouldn't be losing a signal in our selector.
Checklist
#1624
Added tests