Remove cancelation commands when underlying futures are closed #275
Summary
This change filters out activity and timer cancellation commands before sending back the result of the workflow task to Temporal server. It does this when those commands are associated with activities or timers that have already been canceled.
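The filtering step described above could be sketched roughly as follows. This is an illustration only, not the code in this pull request: the `CancelActivity`, `CancelTimer`, and `ScheduleActivity` structs and the `filter_stale_cancellations` helper are hypothetical names, and the sets of closed activity and timer IDs are assumed to have been gathered while processing the history window.

```ruby
# Hypothetical stand-ins for the SDK's command objects; the real classes differ.
CancelActivity   = Struct.new(:activity_id)
CancelTimer      = Struct.new(:timer_id)
ScheduleActivity = Struct.new(:activity_id)

# Drop cancellation commands whose target activity or timer has already closed.
# closed_activity_ids and closed_timer_ids are assumed to be collected while
# replaying the history window for the current workflow task.
def filter_stale_cancellations(commands, closed_activity_ids:, closed_timer_ids:)
  commands.reject do |command|
    case command
    when CancelActivity then closed_activity_ids.include?(command.activity_id)
    when CancelTimer    then closed_timer_ids.include?(command.timer_id)
    else false # every other command type passes through untouched
    end
  end
end

commands = [
  ScheduleActivity.new(1),
  CancelActivity.new(2), # activity 2 already closed, so this cancel is stale
  CancelActivity.new(3), # activity 3 still open, so this cancel is kept
  CancelTimer.new(4)     # timer 4 already closed, so this cancel is stale
]

filtered = filter_stale_cancellations(
  commands,
  closed_activity_ids: [2],
  closed_timer_ids: [4]
)
```

Filtering just before the task result is sent, rather than at the moment cancellation is requested, matters because the closing event may only be seen later in the same history window.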
Motivation
Existing code in future.rb already ensures that cancellation is a no-op on activity and timer futures that have already completed at that point in workflow execution. However, this doesn't cover all cases where an activity or timer is being canceled in the same workflow task where it is completing or failing. Consider the following scenario:
In an earlier workflow task:
In a later workflow task:
RequestCancelActivityTaskCommand which is put in a list to be sent back to the Temporal server.
Because the future will still not be complete when the signal is received, it will be canceled and the command produced. However, later in the processing of the history window, we encounter a history event indicating that cancellation is not valid because the activity has already finished. When the RequestCancelActivityTaskCommand is sent to Temporal server, it will reject it as invalid and retry the workflow task. This will continue indefinitely, putting the workflow in a "stuck" state.
Testing