Handle updates immediately when registered #1306

Quinn-With-Two-Ns · 2023-11-28T17:11:35Z

Handle updates immediately when registered, and give priority to update requests over the main workflow coroutine if a handler is registered. When no update handler has been registered, instead of yielding the update coroutine we now queue the update request and only create the coroutine when the handler is registered or there are no more coroutines to run. All update coroutines are now created at the front of the scheduler so they run before the root coroutine.
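A minimal sketch of the queueing behavior described above, using a toy scheduler rather than the SDK's dispatcher. Every type and function name here (`scheduler`, `deliverUpdate`, `registerHandler`, `pushFront`) is hypothetical, and coroutines are reduced to names in a run queue.

```go
package main

import (
	"fmt"
	"strings"
)

// scheduler is a toy stand-in for the SDK dispatcher. Every name here is
// hypothetical and only illustrates the ordering described above.
type scheduler struct {
	runQueue []string            // coroutine names, in the order they will run
	handlers map[string]bool     // update names that have a registered handler
	pending  map[string][]string // update IDs waiting for a handler, keyed by update name
}

func newScheduler() *scheduler {
	return &scheduler{
		runQueue: []string{"root"},
		handlers: map[string]bool{},
		pending:  map[string][]string{},
	}
}

// pushFront schedules a coroutine ahead of everything already queued.
func (s *scheduler) pushFront(name string) {
	s.runQueue = append([]string{name}, s.runQueue...)
}

// deliverUpdate creates the update coroutine at the front of the queue when a
// handler exists; otherwise it only queues the request.
func (s *scheduler) deliverUpdate(name, id string) {
	if s.handlers[name] {
		s.pushFront("update:" + id)
		return
	}
	s.pending[name] = append(s.pending[name], id)
}

// registerHandler records the handler and creates coroutines for the requests
// that were queued while no handler existed.
func (s *scheduler) registerHandler(name string) {
	s.handlers[name] = true
	ids := s.pending[name]
	delete(s.pending, name)
	// Walk backwards so the earliest request ends up first in the queue.
	for i := len(ids) - 1; i >= 0; i-- {
		s.pushFront("update:" + ids[i])
	}
}

// runOrder queues two updates before any handler exists, then registers the
// handler and reports the queue before and after registration.
func runOrder() (before, after string) {
	s := newScheduler()
	s.deliverUpdate("update", "u1")
	s.deliverUpdate("update", "u2")
	before = strings.Join(s.runQueue, ",")
	s.registerHandler("update")
	after = strings.Join(s.runQueue, ",")
	return before, after
}

func main() {
	before, after := runOrder()
	fmt.Println("before registration:", before)
	fmt.Println("after registration: ", after)
}
```

In this toy model the queued requests create no coroutine until the handler appears, and once it does they sit ahead of `root` in the run queue.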

This is a major change to how the Go SDK handles updates. It should be backwards compatible in the sense that old workflows should continue to run, but new workflows will behave observably differently for users.

closes #1297

Quinn-With-Two-Ns · 2023-11-28T19:23:40Z

test/workflow_test.go

+		inflightUpdates++
+		updatesRan++
+		err := workflow.Sleep(ctx, time.Second)
+		inflightUpdates--


Avoiding defer here because of #1235.

internal/internal_event_handlers.go

dandavison · 2023-11-29T23:14:20Z

internal/internal_event_handlers.go

+		for _, u := range us {
+			u()
+		}
+		rerun = true


Is it definitely true that us was non-empty?

Yes, but I'll admit that isn't obvious. We could move the line into the range block to cover that case.

internal/internal_event_handlers.go

dandavison · 2023-11-29T23:19:06Z

internal/internal_workflow.go

 	// Keep executing until at least one goroutine made some progress
-	for !allBlocked {
+	for !allBlocked || d.allBlockedCallback() {


nit: this reads a bit oddly ("If not allblocked or allblocked"). I'm trying to think of a better name for the callback. Maybe something like tryAdvance()?
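One way to read the loop condition under discussion: keep looping while some coroutine made progress, and once everything is blocked, ask the callback whether it produced new work. The sketch below is a toy model with hypothetical types, not the SDK's `dispatcherImpl`; it uses the suggested name `tryAdvance` for the callback purely for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// coroutine is a toy coroutine that can make a fixed number of steps.
type coroutine struct {
	name  string
	steps int // units of work left before the coroutine blocks for good
}

type dispatcher struct {
	coroutines []*coroutine
	// tryAdvance is called only when every coroutine is blocked. It returns
	// true if it scheduled new work, which makes the loop run again.
	tryAdvance func() bool
	trace      []string
}

// executeUntilAllBlocked keeps running passes until no coroutine progresses
// and tryAdvance has nothing more to schedule.
func (d *dispatcher) executeUntilAllBlocked() {
	allBlocked := false
	for !allBlocked || d.tryAdvance() {
		allBlocked = true
		for _, c := range d.coroutines {
			if c.steps > 0 {
				c.steps--
				d.trace = append(d.trace, c.name)
				allBlocked = false
			}
		}
	}
}

// runTrace runs a root coroutine and releases one queued update the first
// time everything is blocked.
func runTrace() string {
	d := &dispatcher{coroutines: []*coroutine{{name: "root", steps: 2}}}
	released := false
	d.tryAdvance = func() bool {
		if released {
			return false
		}
		released = true
		d.coroutines = append(d.coroutines, &coroutine{name: "queued update", steps: 1})
		return true
	}
	d.executeUntilAllBlocked()
	return strings.Join(d.trace, ",")
}

func main() {
	fmt.Println(runTrace())
}
```

Because of short-circuit evaluation, the callback is never invoked while a coroutine can still make progress, which is what makes a name describing an action (rather than a state) read more naturally in the condition.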

Sushisource

This makes sense to me overall. Just a few comments but nothing major.

internal/internal_event_handlers.go

internal/internal_workflow.go

internal/internal_update_test.go

Sushisource · 2023-11-30T00:16:50Z

internal/internal_event_handlers.go

+	// Check if any blocked updates remain when we have no more coroutines to run and let them run so they are rejected.
+	// Generally iterating a map in workflow code is bad because it is non deterministic
+	// this case is fine since all these update handles will be rejected and not recorded in history.


It wasn't immediately clear to me that rejection would happen. Without the context of what the functions in blockedUpdates really are (wrappers around the update that will also reject if no one registered, not just the user handler), it seems like this might be executing handlers directly.

I'm not sure there's really any great way to deal with that, though.

It's also not immediately obvious that it's guaranteed that all updates will be rejected. I guess this is the case because, if the updates were registered, then they would have had their handlers run before all the coroutines block.

That being the case - perhaps it would make more sense for this function to be called RejectUnhandledUpdates.

Maybe it could even call a function that explicitly rejects the update, rather than the entire handler. I don't feel strongly about that though.

Yeah, I can document more clearly that blockedUpdates doesn't contain the handlers directly, but spawns a coroutine to handle them.

It's also not immediately obvious that it's guaranteed that all updates will be rejected. I guess this is the case because, if the updates were registered, then they would have had their handlers run before all the coroutines block.

Yeah, it is up to the handler to reject. I'd rather keep the rejection logic all in the handler rather than sprinkle it around.

That being the case - perhaps it would make more sense for this function to be called RejectUnhandledUpdates

I didn't name it that because I thought that if I added a dynamic update handle then those updates would be handled here. We don't support dynamic anything in Go yet, so maybe that isn't a good reason.
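To make the point about wrappers concrete, here is a small sketch under the assumption that each queued entry is a closure that looks up the handler at run time and rejects when none exists. The `workflowState` type and its methods are hypothetical; only the name `blockedUpdates` comes from the discussion above.

```go
package main

import "fmt"

// workflowState is a toy model; the real SDK types differ.
type workflowState struct {
	handlers map[string]func() string // registered update handlers by name
	blocked  map[string][]func()      // wrappers for updates that had no handler on arrival
	history  []string                 // only accepted updates are recorded
	rejected int
}

// queueUpdate stores a wrapper, not the user handler. The wrapper decides at
// run time whether to call the handler or reject the update.
func (w *workflowState) queueUpdate(name string) {
	wrapper := func() {
		handler, ok := w.handlers[name]
		if !ok {
			w.rejected++ // a rejection leaves no trace in history
			return
		}
		w.history = append(w.history, handler())
	}
	w.blocked[name] = append(w.blocked[name], wrapper)
}

// runBlockedUpdates runs every remaining wrapper and reports whether any ran.
// Map iteration order is random, which is harmless in this model because
// nothing a rejected update does is recorded.
func (w *workflowState) runBlockedUpdates() bool {
	ran := false
	for name, wrappers := range w.blocked {
		for _, run := range wrappers {
			run()
			ran = true // set inside the loop so an empty slice does not count
		}
		delete(w.blocked, name)
	}
	return ran
}

// demoReject queues three updates with no handler registered and flushes twice.
func demoReject() (rejected, recorded int, ranAgain bool) {
	w := &workflowState{
		handlers: map[string]func() string{},
		blocked:  map[string][]func(){},
	}
	w.queueUpdate("a")
	w.queueUpdate("b")
	w.queueUpdate("a")
	w.runBlockedUpdates()
	ranAgain = w.runBlockedUpdates()
	return w.rejected, len(w.history), ranAgain
}

func main() {
	rejected, recorded, ranAgain := demoReject()
	fmt.Println(rejected, recorded, ranAgain)
}
```

Keeping the rejection inside the wrapper means the flush function does not need to know whether an update will be accepted or rejected, which matches the preference stated above for keeping rejection logic in one place.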

internal/internal_workflow.go

dandavison · 2023-11-30T02:13:46Z

internal/internal_workflow.go

@@ -1123,8 +1134,10 @@ func (d *dispatcherImpl) ExecuteUntilAllBlocked(deadlockDetectionTimeout time.Du
 			}
 		}


In TypeScript, the Update handler immediately preempts the code that was executing the setHandler call, whereas here we're going to let all outstanding coroutines run until they're blocked/exit, and only then turn to the queued eager Update handler. I think that's going to result in (subtly) different semantics, e.g. related to whether or not an Update handler will see changes made by other concurrent routines. I'm not necessarily suggesting a change; that may fall well within expected inter-SDK variability. But I think it would be technically possible to execute the pending eager coroutines inside the inner for loop, rather than waiting for that loop to exit (and I suspect it would bring TypeScript and Go semantics closer together, but I haven't done the work to come up with an example).

Yes, no coroutine in the Go SDK executes immediately when created. The same is true if users create their own coroutine or resolve a future. The runtimes schedule work differently. I agree it is probably technically possible to make the updates more "eager", but it isn't required to get the behavior WRT updates running before the main workflow function and allowing updates to run after being registered before the main function is finished.

It isn't required to get the behavior WRT updates running before the main workflow function and allowing updates to run after being registered before the main function is finished.

Hm, I think that's not true. Our requirements for signal and update hold that pending signals and updates must get an opportunity to influence the workflow return value, and the params passed to CAN. But the workflow return value may be computed by one of the pending coroutines in the inner loop. Similarly, one of the pending coroutines in the inner loop could be a signal or update handler that CANs.

I'm thinking that we can meet that requirement by ensuring that, when we yield back to the executor loop in setUpdateHandler, the new handler coroutine is the very next coroutine to be handled.

Here's a failing test that demonstrates what I'm saying. I think that because your test workflows are using workflow.Await before the end, they are passing, but that we actually have a stronger requirement.

func (w *Workflows) UpdateSetHandlerOnly(ctx workflow.Context) (int, error) {
	updatesRan := 0
	updateHandle := func(ctx workflow.Context) error {
		updatesRan++
		return nil
	}
	workflow.SetUpdateHandler(ctx, "update", updateHandle)
	return updatesRan, nil
}

func (ts *IntegrationTestSuite) TestUpdateAlwaysHandled() {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	options := ts.startWorkflowOptions("test-update-always-handled")
	options.StartDelay = time.Hour
	run, err := ts.client.ExecuteWorkflow(ctx, options, ts.workflows.UpdateSetHandlerOnly)
	ts.NoError(err)
	// Send an update before the first workflow task
	_, err = ts.client.UpdateWorkflow(ctx, run.GetID(), run.GetRunID(), "update")
	ts.NoError(err)
	var result int
	ts.NoError(run.Get(ctx, &result))
	ts.Equal(1, result)
}

The equivalent test in Typescript passes: https://github.com/dandavison/temporalio-sdk-typescript/blob/125241bbd3a1b3cf29e59d47eaa0e979ad3d7e69/packages/test/src/test-integration-update.ts#L217-L243

This test passes with my changes

I added this test as part of the PR since I think it is a good thing to test

Oh sorry, moving too fast, I thought that failed. OK, so I'd have to think more about whether one can construct a situation where a coroutine in the inner loop can CAN with a value that should be influenced by a pending eager coroutine. Can you see that that is impossible?

No worries. I don't see how CAN is different from returning any normal value from the workflow function.

dandavison

This is great and it's been teaching me about sdk-go. I'm currently thinking that we do need to implement maximally eager scheduling in order to uphold the requirements we've agreed upon (explained in comment).

test/integration_test.go

dandavison · 2023-11-30T15:13:57Z

internal/internal_workflow.go


internal/internal_flags.go

dandavison · 2023-12-04T16:15:51Z

internal/internal_coroutines_test.go

+	err := d.ExecuteUntilAllBlocked(defaultDeadlockDetectionTimeout)
+	require.NoError(t, err)
+	require.True(t, d.IsDone())
+	require.Equal(t, []string{"root", "root yield start", "outer eager coroutine", "inner eager coroutine", "coroutine 1", "coroutine 2", "root yield finish"}, history)


Nice. Just for the record, as a faithless reviewer, I have double-checked that this fails without the latest updates to the executor loop.
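The ordering asserted in the test above can be reproduced with a toy dispatcher in which an "eager" coroutine runs immediately after the step that created it, while ordinary coroutines wait their turn. This is a sketch under assumptions: the `dispatcher`, `newEagerCoroutine`, and `resume` names are hypothetical, coroutines are modeled as lists of steps separated by yields, and the arrangement of the root coroutine is only one plausible setup that yields the same history.

```go
package main

import (
	"fmt"
	"strings"
)

// coroutine is modeled as a list of steps; each step runs until the next yield.
type coroutine struct {
	steps []func(d *dispatcher)
}

type dispatcher struct {
	coroutines []*coroutine
	eager      []*coroutine // created during the current step; must run next
	history    []string
}

func (d *dispatcher) log(s string) { d.history = append(d.history, s) }

// newCoroutine appends an ordinary coroutine to the end of the schedule.
func (d *dispatcher) newCoroutine(steps ...func(d *dispatcher)) {
	d.coroutines = append(d.coroutines, &coroutine{steps: steps})
}

// newEagerCoroutine marks a coroutine to run as soon as the creator yields.
func (d *dispatcher) newEagerCoroutine(steps ...func(d *dispatcher)) {
	d.eager = append(d.eager, &coroutine{steps: steps})
}

// resume runs one step of c, then drains eager coroutines created by that
// step (and, recursively, by the eager coroutines themselves).
func (d *dispatcher) resume(c *coroutine) bool {
	if len(c.steps) == 0 {
		return false
	}
	step := c.steps[0]
	c.steps = c.steps[1:]
	step(d)
	for len(d.eager) > 0 {
		next := d.eager[0]
		d.eager = d.eager[1:]
		d.coroutines = append(d.coroutines, next) // stays schedulable for later steps
		d.resume(next)
	}
	return true
}

// executeUntilAllBlocked repeats passes until no coroutine makes progress.
func (d *dispatcher) executeUntilAllBlocked() {
	for progressed := true; progressed; {
		progressed = false
		for i := 0; i < len(d.coroutines); i++ {
			if d.resume(d.coroutines[i]) {
				progressed = true
			}
		}
	}
}

func eagerOrder() string {
	d := &dispatcher{}
	d.newCoroutine(
		func(d *dispatcher) {
			d.log("root")
			d.newCoroutine(func(d *dispatcher) { d.log("coroutine 1") })
			d.newCoroutine(func(d *dispatcher) { d.log("coroutine 2") })
			d.newEagerCoroutine(func(d *dispatcher) {
				d.log("outer eager coroutine")
				d.newEagerCoroutine(func(d *dispatcher) { d.log("inner eager coroutine") })
			})
			d.log("root yield start")
		},
		func(d *dispatcher) { d.log("root yield finish") },
	)
	d.executeUntilAllBlocked()
	return strings.Join(d.history, " | ")
}

func main() {
	fmt.Println(eagerOrder())
}
```

The key design choice in this model is that eager coroutines are drained inside `resume`, in the inner pass, rather than after the pass has finished, so they overtake coroutines that were created earlier but scheduled normally.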

dandavison · 2023-12-04T16:21:09Z

Great, that second modification to the loop was pretty clean. So my only remaining question here is about backwards compatibility: suppose that we decide that we are interested in guaranteeing that updates always execute in the order supplied in the WFT, rather than sometimes executing in the order of handler registration. If we were to release this now, would that make such work more difficult than it would be without having released this change?

Quinn-With-Two-Ns · 2023-12-04T16:34:00Z

So my only remaining question here is about backwards compatibility: suppose that we decide that we are interested in guaranteeing that updates always execute in the order supplied in the WFT, rather than sometimes executing in the order of handler registration. If we were to release this now, would that make such work more difficult than it would be without having released this change?

No issue with backwards compatibility. The reason is that history will record the messages in the order the SDK processed the updates, not the order supplied in the WFT. So when replaying, the order processed is the order supplied.
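The replay argument can be illustrated with a deliberately simplified model, assuming each update is identified by the name of its handler and that "processing" only decides an order. Both strategy functions below are hypothetical; they are not SDK code.

```go
package main

import (
	"fmt"
	"strings"
)

// processByRegistration models the behavior discussed in this PR: an update
// runs once its handler is registered, so registration order can override
// the order in which updates were supplied.
func processByRegistration(supplied, registration []string) []string {
	var processed []string
	for _, handler := range registration {
		for _, u := range supplied {
			if u == handler {
				processed = append(processed, u)
			}
		}
	}
	return processed
}

// processBySuppliedOrder models the hypothetical stricter guarantee: updates
// always run in the order supplied.
func processBySuppliedOrder(supplied []string) []string {
	return append([]string(nil), supplied...)
}

// replayOrders records history on a first execution, then replays that
// history under both strategies.
func replayOrders() (recorded, replayCurrent, replayStrict string) {
	supplied := []string{"b", "a"}     // order in the workflow task
	registration := []string{"a", "b"} // order in which handlers are registered
	history := processByRegistration(supplied, registration)
	recorded = strings.Join(history, ",")
	// On replay, the updates are supplied in the recorded order.
	replayCurrent = strings.Join(processByRegistration(history, registration), ",")
	replayStrict = strings.Join(processBySuppliedOrder(history), ",")
	return recorded, replayCurrent, replayStrict
}

func main() {
	fmt.Println(replayOrders())
}
```

In this model both strategies reproduce the recorded order on replay, because the input to replay is already in processed order.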

dandavison

LGTM!

Quinn-With-Two-Ns marked this pull request as ready for review November 28, 2023 19:23

Quinn-With-Two-Ns requested a review from a team as a code owner November 28, 2023 19:23

Quinn-With-Two-Ns requested review from dandavison and cretz November 28, 2023 19:24

dandavison reviewed Nov 29, 2023

Sushisource reviewed Nov 30, 2023

dandavison reviewed Nov 30, 2023

dandavison reviewed Nov 30, 2023

dandavison requested changes Nov 30, 2023

Quinn-With-Two-Ns force-pushed the issue-1297 branch from a5e6c73 to c43c47f November 30, 2023 16:51

Sushisource approved these changes Nov 30, 2023

cretz approved these changes Dec 1, 2023

Quinn-With-Two-Ns force-pushed the issue-1297 branch from 7c50575 to 3f57f53 December 2, 2023 01:42

Quinn-With-Two-Ns requested a review from dandavison December 2, 2023 01:49

dandavison reviewed Dec 4, 2023

dandavison approved these changes Dec 4, 2023

Quinn-With-Two-Ns requested review from cretz and Sushisource December 4, 2023 16:57

cretz approved these changes Dec 4, 2023

Quinn-With-Two-Ns added 8 commits December 4, 2023 11:12

146b855 Handle updates immediately when registered
	Handle updates immediately when registered. Handle updates before the root coroutine.

26e5d58 Fix errcheck

46c2e4a Disable parallel tests on TestDefaultUpdateHandler

6855e3a Remove defers

5c92417 Respond to PR comments

b3a0e69 Make scheduling of update more eager

9c4a1ca run go mod tidy

6c6ab19 Run go mod tidy everywhere

Quinn-With-Two-Ns force-pushed the issue-1297 branch from 2d3382d to 6c6ab19 December 4, 2023 19:12

Quinn-With-Two-Ns merged commit 5fdbecc into temporalio:master Dec 4, 2023
12 checks passed

cretz mentioned this pull request Jan 3, 2024: Always yield when registering update handles #1325 (Merged)
