Fix lost signal from Selector when Default path blocks #1682

yuandrew · 2024-10-22T21:34:30Z

What was changed

Fixed a lost signal, gated behind a new SdkFlag. Currently the flag is added, but never set anywhere, so existing behavior is unchanged. A separate PR will be added to enable the flag.

Why?

We shouldn't be losing a signal in our selector
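To illustrate the failure mode, here is a minimal sketch in plain Go. It is a toy model, not the SDK's code: the `channel` type, `selectWithDefault`, and the `fixEnabled` parameter are all hypothetical names, and the "blocking" default branch is modeled as a function during which a signal arrives. The assumption is that the receive callback claims the value but only stores it when its ready branch runs, which never happens on the default path.

```go
package main

// channel is a toy stand-in for a workflow channel. It is not the SDK's type.
type channel struct {
	recValue *string             // value handed to a receiver but not yet read
	buffer   []string            // values sent while nobody was waiting
	blocked  []func(string) bool // callbacks registered by a pending select
}

// send offers v to the blocked callbacks in order and buffers it if none accepts.
func (c *channel) send(v string) {
	for i, cb := range c.blocked {
		if cb(v) {
			c.blocked = append(c.blocked[:i], c.blocked[i+1:]...)
			return
		}
	}
	c.buffer = append(c.buffer, v)
}

// receiveAsync returns a pending value without blocking, if there is one.
func (c *channel) receiveAsync() (string, bool) {
	if c.recValue != nil {
		v := *c.recValue
		c.recValue = nil
		return v, true
	}
	if len(c.buffer) > 0 {
		v := c.buffer[0]
		c.buffer = c.buffer[1:]
		return v, true
	}
	return "", false
}

// selectWithDefault models a selector that has both a receive case and a default case.
func selectWithDefault(c *channel, fixEnabled bool, def func()) {
	var readyBranch func()
	c.blocked = append(c.blocked, func(v string) bool {
		if readyBranch != nil {
			return false // this select already claimed a value
		}
		if fixEnabled {
			c.recValue = &v // park the value right away so a later receive still sees it
		}
		readyBranch = func() {
			if !fixEnabled {
				c.recValue = &v // old behavior: the value is stored only if this branch runs
			}
		}
		return true // the channel now treats v as consumed
	})
	def()           // the default branch runs instead; readyBranch is never invoked
	c.blocked = nil // drop the callback once the select returns
}
```

In this model, sending a value from inside `def` with `fixEnabled` set to false leaves nothing for a later `receiveAsync`, while the same sequence with `fixEnabled` set to true returns the value.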

Checklist

#1624

How was this tested:

Added tests

Quinn-With-Two-Ns · 2024-11-01T16:39:11Z

internal/internal_workflow.go

-					readyBranch = func() {
+					// readyBranch is not executed when AddDefault is specified,
+					// setting the value here prevents the signal from being dropped
+					dropSignalFlag := getWorkflowEnvironment(ctx).GetFlag(SDKFlagBlockedSelectorSignalReceive)


What test did you add that tests the true path here?

TestSelectBlockingDefaultWithFlag tests the true scenario.

Ah, I see: the test environment always assumes the flag is true. Hm, I am not sure we should assume that, since the test environment would then diverge from the real workflow environment. What do you think?

I agree that the test environment shouldn't assume flags are true by default. That seems like a change that belongs in its own separate PR; I can create an issue for this.

To be clear, I don't think SDK flags should be off by default in the test environment. I meant that the SDK flags used in the test environment should match what we enable when running a new workflow.

Oops, I misunderstood, I agree.

Is there anywhere flags are set by default for a workflow? From looking at the code, it seems like TryUse calls are scattered around the code for different scenarios.

> Is there anywhere flags are set by default for a workflow? From looking at the code, it seems like TryUse calls are scattered around the code for different scenarios.

Yeah, the way to think about it is that for new workflows all current flags are set to true. However, for the new flag that you are adding here, we don't want to set it to true by default, because that would make it difficult for users to roll back their SDK version.
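The distinction discussed above can be sketched with a toy flag set in plain Go. This is an assumption-laden model, not the SDK's implementation: `flagSet`, `getFlag`, `tryUse`, `useFix`, and the flag's numeric value are hypothetical, and the semantics (a "try use" call records the flag on first execution, a plain lookup never records) are assumed from the conversation rather than taken from the code.

```go
package main

// sdkFlag identifies a behavior change. The value below is made up for this sketch.
type sdkFlag uint32

const flagBlockedSelectorSignalReceive sdkFlag = 1

// flagSet is a toy model of the flags recorded for one workflow execution.
type flagSet struct {
	recorded map[sdkFlag]bool
}

func newFlagSet() *flagSet {
	return &flagSet{recorded: map[sdkFlag]bool{}}
}

// getFlag reports whether the flag is already recorded. It never records anything.
func (f *flagSet) getFlag(flag sdkFlag) bool {
	return f.recorded[flag]
}

// tryUse records the flag when the code runs for the first time (not replaying)
// and otherwise reports what was recorded earlier.
func (f *flagSet) tryUse(flag sdkFlag, replaying bool) bool {
	if replaying {
		return f.recorded[flag]
	}
	f.recorded[flag] = true
	return true
}

// useFix chooses between the two lookups. With optIn false the fix applies only
// to executions that already carry the flag, so nothing new is written to history.
func useFix(f *flagSet, optIn, replaying bool) bool {
	if optIn {
		return f.tryUse(flagBlockedSelectorSignalReceive, replaying)
	}
	return f.getFlag(flagBlockedSelectorSignalReceive)
}
```

Under these assumptions, a flag that is only ever read stays off for every new execution, so a worker rolled back to an older SDK version never meets a history that depends on it.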

Quinn-With-Two-Ns · 2024-11-01T17:14:55Z

internal/internal_coroutines_test.go

+	require.EqualValues(t, expected, history)
+}
+
+func TestSelectBlockingDefaultWithFlag(t *testing.T) {


The reported bug was that blocking in the default case of a selector could cause signals to be lost. When I last looked at these tests, we didn't seem to have any coverage for blocking in one selector case while a signal is received. Can we add tests to verify there are no bugs if a signal is received while blocking in another case of a selector, not just default?

Quinn-With-Two-Ns · 2024-11-04T22:25:12Z

Do you think it would be feasible to add some debug API to set the flag to true, plus an integration test and a replay test that test a real workflow with the flag set?

Quinn-With-Two-Ns · 2024-11-14T21:35:32Z

test/workflow_test.go

+	ch2 := workflow.NewChannel(ctx)
+
+	if enableFlag {
+		internal.SetUnblockSelectorSignal()


I think we should do this as part of the test, not the workflow. More importantly, though, can you confirm that setting this won't affect any test that runs after this test?

> we should do this as part of the test, not the workflow

What is the difference between the two? Or is this more of a style preference?

> can you confirm setting this won't affect any test that runs after this test?

Looks like it does affect subsequent tests :( Is it enough to unset this value? Go tests run in parallel, so that seems insufficient?

> What is the difference between the two? Or is this more of a style preference?

I think it is clearer, when we are testing a workflow, if all the setup is run beforehand.

> Go tests run in parallel, so that seems insufficient?

Only if t.Parallel() is called.
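One way to keep a package-level switch from leaking into later tests is to set it in the test's setup and restore the previous value in a cleanup. The sketch below is only an illustration: `unblockSelectorSignal`, `setUnblockSelectorSignal`, and `enableForTest` are hypothetical stand-ins written for this example, not the functions added in this PR. The helper takes a registration function so it can be passed `t.Cleanup` from a real test.

```go
package main

// unblockSelectorSignal stands in for a package-level debug switch (hypothetical).
var unblockSelectorSignal bool

// setUnblockSelectorSignal is a single setter that takes a bool, so one function
// covers both enabling and disabling the switch.
func setUnblockSelectorSignal(enabled bool) {
	unblockSelectorSignal = enabled
}

// enableForTest turns the switch on and registers a cleanup that restores the
// previous value. Pass t.Cleanup from a *testing.T as the register argument.
func enableForTest(register func(func())) {
	prev := unblockSelectorSignal
	setUnblockSelectorSignal(true)
	register(func() { setUnblockSelectorSignal(prev) })
}
```

A test would call `enableForTest(t.Cleanup)` before starting the workflow. Because the switch is shared by the whole package, this only stays safe as long as tests that touch it do not call `t.Parallel()`.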

Quinn-With-Two-Ns · 2024-11-15T15:41:03Z

test/integration_test.go

+	defer cancel()
+	options := ts.startWorkflowOptions("test-selector-block")
+
+	internal.SetUnblockSelectorSignal()


nit: could just have one function that takes a bool: SetUnblockSelectorSignal(bool)

Quinn-With-Two-Ns · 2024-11-15T15:42:43Z

LGTM! Thanks for putting up with all my requests for more tests

* initial changes, added replay test for legacy history, need to finish writing tests
* Clean up tests, fix error
* unit test for fixed behavior
* PR feedback
* improve tests, add tests for AddFuture, AddSend
* add integration tests, add debug API to enable SDK flag for tests
* set flag in test itself not workflow, unset flag after test
* unify set/unset function into one

initial changes, added replay test for legacy history, need to finish…

9683052

… writing tests

yuandrew commented Oct 22, 2024

internal/internal_coroutines_test.go

yuandrew and others added 4 commits October 25, 2024 15:04

Merge branch 'master' into selector-signal-loss

8485964

Clean up tests, fix error

d0e9246

Merge branch 'selector-signal-loss' of github.com:yuandrew/sdk-go int…

d196da2

…o selector-signal-loss

unit test for fixed behavior

ca909b6

yuandrew marked this pull request as ready for review October 28, 2024 16:50

yuandrew requested a review from a team as a code owner October 28, 2024 16:50

Quinn-With-Two-Ns reviewed Oct 28, 2024

test/replaytests/workflows.go

Quinn-With-Two-Ns reviewed Oct 28, 2024

internal/internal_workflow.go

PR feedback

5d2778e

yuandrew requested a review from Quinn-With-Two-Ns November 1, 2024 16:34

Quinn-With-Two-Ns reviewed Nov 1, 2024

improve tests, add tests for AddFuture, AddSend

ceadefd

add integration tests, add debug API to enable SDK flag for tests

7ad8169

Quinn-With-Two-Ns reviewed Nov 14, 2024

set flag in test itself not workflow, unset flag after test

c96c3a5

Quinn-With-Two-Ns reviewed Nov 15, 2024

Quinn-With-Two-Ns approved these changes Nov 15, 2024

yuandrew and others added 2 commits November 15, 2024 09:37

unify set/unset function into one

2c390ab

Merge branch 'master' into selector-signal-loss

4d37fb8

yuandrew merged commit c31c2f2 into temporalio:master Nov 18, 2024
13 checks passed

yuandrew deleted the selector-signal-loss branch November 18, 2024 19:10
