-
Notifications
You must be signed in to change notification settings - Fork 889
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add DeleteWorkflowExecution API to history service #2311
Add DeleteWorkflowExecution API to history service #2311
Conversation
service/history/consts/const.go
Outdated
@@ -52,6 +52,8 @@ var ( | |||
ErrActivityTaskNotFound = serviceerror.NewNotFound("invalid activityID or activity already timed out or invoking workflow is completed") | |||
// ErrWorkflowCompleted is the error to indicate workflow execution already completed | |||
ErrWorkflowCompleted = serviceerror.NewNotFound("workflow execution already completed") | |||
// ErrWorkflowIsRunning is the error to indicate workflow is still running | |||
ErrWorkflowIsRunning = serviceerror.NewInvalidArgument("workflow is running") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ErrWorkflowNotCompleted?
would be nicely symmetric to ErrWorkflowCompleted (when it shouldn't be)
proto/internal/temporal/server/api/historyservice/v1/service.proto
Outdated
Show resolved
Hide resolved
// if this is not called then callers will get mutable state even though its been removed from database | ||
workflowContext.Clear() | ||
return nil | ||
return t.deleteManager.DeleteWorkflowExecutionFromTimerTask( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing where you handle the case that the execution store returns NotFound (which I presume, in the case of a Timer-based deletion of an Execution, isn't considered an error. Of course that might be handled in code not touched by this PR ... was just wondering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The persistence methods all return success even if the thing they're deleting is not there. The important point is that it's calling several methods in a row. If 1 succeeds and 2 fails, then we retry the whole thing, the next time we need 1 to succeed so we can get to 2. (This is inside of shard.Context now.)
If you don't like those semantics, I guess we could have it return NotFound and have the shard context skip over that? Or to avoid the client having to retry, push all deletes through the timer queue and use that to ensure it completes? (Or both)
b16cbd8
to
8c7c9a2
Compare
// DeleteWorkflowExecution is called from transfer task queue processor | ||
// and corresponding transfer task is created only if workflow is not running. | ||
// Therefore, this should almost never happen but if it does (cross DC resurrection, for example), | ||
// workflow should not be deleted. NotFound errors are ignored by task processor. | ||
return consts.ErrWorkflowNotCompleted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is ErrWorkflowNotCompleted NotFound error? Or this will block transfer queue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not obvious, but yes.
What changed?
Add
DeleteWorkflowExecution
API to the history service.DeleteWorkflowExecution
method in history handler calls corresponding method in history engine, which creates transfer task of newly added typeTransferDeleteExecution
and then task processor callsDeleteWorkflowExecution
method fromDeleteManager
which deletes all database records associated with workflow execution:Why?
This API will be used as part of namespace deletion process. Also it will be exposed through frontend and will replace direct DB access in
tctl
.How did you test it?
Unit test and integration tests in upcoming PR.
Potential risks
No risks.
Is hotfix candidate?
No.