-
Notifications
You must be signed in to change notification settings - Fork 216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Workaround for updates delivered in first WFT #1026
Conversation
If the first WFT includes an update we can have a problem because the workflow function itself hasn't run yet to register update handlers and the update will be rejected. Adding this yield allows the scheduler to execute the workflow function up to the first blocking point, at which point the handler(s) will be registered by user code.
I don't love this - it's a little brittle and dependent on current scheduler design - but I haven't been able to figure out a better/cleaner way to achieve the same thing. Would love to hear some other suggestions. |
// delivered before the workflow function itself has run and had a | ||
// chance to register update handlers) then we yield control back to the | ||
// scheduler to allow handler registration to occur. The sceduler will | ||
// resume this coroutine after others have run to a blocking point. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrmm. I am racking my brain trying to think of a downside to this. We will have to discuss what update-with-start looks like I think (just whether the update event coming before the workflow start event is even an issue).
I can't think of any downsides to this off the top of my head. Even if there are multiple updates before workflow start I think a yield from each is safe. And I don't think yielding in every case of no handlers no matter how far into the workflow we are is harmful either (just pushing this to the end of the loop). Is there any chance this yield could not get completed before next WFT? I know there's some "run until all yielded" code/logic before commands are sent as a WFT completion, so I want to confirm that this "yield" is not considered waiting on Temporal like the other yields.
I would like to request a update-before-worker-start integration test in the features repo (though all other languages offer pre-start handler registration unlike Go).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside for me is that this aspect of update delivery (admittedly an edge-case) becomes dependent on the precise mechanism of how workflow execution happens and it does so in a somewhat invisible way. In particular there is an expectation encoded here that during startup the root coro always gets scheduler slot zero and that it yields exactly once before running the user-supplied workflow code.
As for the update-before-start integration test ... that's actually all of them! Prior to this PR, I always had to time.Sleep(1*time.Second) between launching the workflow and submitting the first update. Without that sleep, the first update sees an error returned "no registered handler for 'foo'" With this change, that sleep can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed one of your questions...
Is there any chance this yield could not get completed before next WFT? I know there's some "run until all yielded" code/logic before commands are sent as a WFT completion, so I want to confirm that this "yield" is not considered waiting on Temporal like the other yields.
What I found is that yield is lighter than the blocking that you might otherwise see (e.g. with workflow.Sleep(ctx, 1)) in that the yielded coroutine is not considered to have blocked and thus stays in a runnable state within the current call to ExecuteUntilAllBlocked on the coro dispatcher.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside for me is that this aspect of update delivery (admittedly an edge-case) becomes dependent on the precise mechanism of how workflow execution happens and it does so in a somewhat invisible way.
This is the accepted cost of Go SDK's inability to predefine anything about a workflow before starting it. No worries (much cleaner in other langs)
is not considered to have blocked [...] within the current call to ExecuteUntilAllBlocked
Perfect, thanks
Co-authored-by: Chad Retz <chad.retz@gmail.com>
// delivered before the workflow function itself has run and had a | ||
// chance to register update handlers) then we yield control back to the | ||
// scheduler to allow handler registration to occur. The sceduler will | ||
// resume this coroutine after others have run to a blocking point. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside for me is that this aspect of update delivery (admittedly an edge-case) becomes dependent on the precise mechanism of how workflow execution happens and it does so in a somewhat invisible way.
This is the accepted cost of Go SDK's inability to predefine anything about a workflow before starting it. No worries (much cleaner in other langs)
is not considered to have blocked [...] within the current call to ExecuteUntilAllBlocked
Perfect, thanks
// chance to register update handlers) then we yield control back to the | ||
// scheduler to allow handler registration to occur. The scheduler will | ||
// resume this coroutine after others have run to a blocking point. | ||
if len(eo.updateHandlers) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few of concerns here:
-
What are the delivery ordering semantics after adding this yield?
What if we have a signal-update-signal sequence?
With this solution, will the update necessarily be delivered after the signals? -
What about the future async update guarantees?
Are we okay rejecting any updates if a handler is not registered by the end of a workflow task? This is different than signals, which are unidirectional and are okay to buffer. -
What if user wants to run a local activity before registering a dynamic update handler?
Does using this one time yield mechanism prevent that case?
I'm leaning towards having a solution where we buffer updates and (at least with the fast path) we reject them at the end of workflow task processing.
What was changed
Adds a conditional yield point during update processing to allow the scheduler to do other things like registering update handlers.
Why?
If the first WFT includes an update we can have a problem because the workflow function itself hasn't run yet to register update handlers and the update will be rejected. Adding this yield allows the scheduler to execute the workflow function up to the first blocking point, at which point the handler(s) will be registered by user code.
Checklist
Closes
How was this tested:
Unit test here and I was able to remove all the time.Sleep(1*time.Second) from the features tests.
No