Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce a new constructor to override retry policy in ContinueAsNewError #1383

Merged
merged 11 commits into from
Feb 14, 2024

Conversation

jiezhang
Copy link
Contributor

@jiezhang jiezhang commented Feb 8, 2024

What was changed

Introduce a new constructor, NewContinueAsNewErrorWithOptions , to override retry policy in ContinueAsNewError.

Why?

For a long running workflow that is restarted periodically using NewContinueAsNewError, this new option could be used to override the existing RetryPolicy. Without this feature, we would need re-submit the workflow using StartWorkflowOptions.

Note that this option is already available in sdk-java: https://github.com/temporalio/sdk-java/blob/72ebff1dcb2c14b3aecc7de14a2b9aa21ef9cc4a/temporal-sdk/src/main/java/io/temporal/workflow/ContinueAsNewOptions.java#L95

Checklist

  1. Closes Add an option to override retry policy #1384

  2. How was this tested:

go run . unit-test -run Test_ContinueAsNewError
=== RUN   Test_ContinueAsNewError
--- PASS: Test_ContinueAsNewError (0.00s)
=== RUN   Test_ContinueAsNewErrorWithOptions
--- PASS: Test_ContinueAsNewErrorWithOptions (0.00s)
PASS

go run . integration-test -run TestIntegrationSuite/TestContinueAsNew
--- PASS: TestIntegrationSuite (19.30s)
    --- PASS: TestIntegrationSuite/TestContinueAsNew (4.08s)
    --- PASS: TestIntegrationSuite/TestContinueAsNewCarryOver (4.03s)
    --- PASS: TestIntegrationSuite/TestContinueAsNewWithRetryPolicy (4.05s)
    --- PASS: TestIntegrationSuite/TestContinueAsNewWithWithChildWF (6.09s)
PASS

go run . integration-test -run TestIntegrationSuite/TestChildWFWithRetryPolicy
--- PASS: TestIntegrationSuite (11.31s)
    --- PASS: TestIntegrationSuite/TestChildWFWithRetryPolicy_LongRunning (4.15s)
    --- PASS: TestIntegrationSuite/TestChildWFWithRetryPolicy_LongRunningWithCustomRetry (6.07s)
    --- PASS: TestIntegrationSuite/TestChildWFWithRetryPolicy_ShortLived (0.04s)
PASS
  1. Any docs updates needed?
    N/A

@CLAassistant
Copy link

CLAassistant commented Feb 8, 2024

CLA assistant check
All committers have signed the CLA.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@jiezhang jiezhang force-pushed the retry branch 2 times, most recently from 8cfce96 to 3f724db Compare February 8, 2024 22:40
@jiezhang jiezhang marked this pull request as ready for review February 8, 2024 22:42
@jiezhang jiezhang requested a review from a team as a code owner February 8, 2024 22:42
@Quinn-With-Two-Ns
Copy link
Contributor

Thanks for the contribution, Can you please add integration tests to verify this change? Just a simple test showing when you continue-as-new the RetryPolicy in WorkflowInfo is updated.

@jiezhang jiezhang changed the title Introduce an option to override retry policy Introduce a workflow option to override retry policy Feb 9, 2024
internal/error.go Outdated Show resolved Hide resolved
@jiezhang jiezhang force-pushed the retry branch 2 times, most recently from c5634fa to 759bf74 Compare February 12, 2024 20:35
@jiezhang jiezhang changed the title Introduce a workflow option to override retry policy Introduce an option to override retry policy in ContinueAsNewError Feb 13, 2024
@jiezhang jiezhang changed the title Introduce an option to override retry policy in ContinueAsNewError Introduce a new constructor to override retry policy in ContinueAsNewError Feb 13, 2024
// For backward compatibility, we can't introduce an option, say WithWorkflowRetryPolicy,
// to override the retry policy attached to the workflow context.
// See #676 for more details.
RetryPolicy *commonpb.RetryPolicy
Copy link
Member

@cretz cretz Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
RetryPolicy *commonpb.RetryPolicy
RetryPolicy *RetryPolicy

Can we make this the user-friendly version? (I understand other things like input/header are raw, but there's no value in exposing a raw retry policy IMO)

Copy link
Contributor Author

@jiezhang jiezhang Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commonpb.RetryPolicy is currently used in types such as WorkflowOptions, ExecuteActivityOptions. I guess it's slightly better than serializing using the json format, but I'm not exactly sure. Happy to change the type once you confirm that it is okay to deviate from the existing pattern.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WorkflowOptions

Which workflow options? client.StartWorkflowOptions accepts a non-proto RetryPolicy as does workflow.ChildWorkflowOptions, but it's possible we have left some proto ones around.

ExecuteActivityOptions

Which execute activity options? workflow.ActivityOptions accepts a non-proto RetryPolicy, but it's possible we have left some proto ones around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WorkflowOptions:

RetryPolicy *commonpb.RetryPolicy

ExecuteActivityOptions:

RetryPolicy *commonpb.RetryPolicy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ContinueAsNewError:

Input *commonpb.Payloads

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all of those are user facing and it's acknowledged that we put raw input/headers. But I think most of our user-facing options sets that accept retry policy (start workflow options, child workflow options, activity options, etc) accept the non-proto form.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error type gets serialized over the wire right? Any implications if a non-primitive type is used in such a struct? If it can be used interchangeably, why do we need commonpb.RetryPolicy in the first place? @cretz

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it can be used interchangeably, why do we need commonpb.RetryPolicy in the first place?

Explained in Slack, but that's the proto our API recognizes but is a bit annoying for users to use so we have a higher level form.

@@ -459,6 +471,17 @@ func NewContinueAsNewError(ctx Context, wfn interface{}, args ...interface{}) er
return i.NewContinueAsNewError(ctx, wfn, args...)
}

// NewContinueAsNewErrorWithOptions creates ContinueAsNewError instance with additional options.
func NewContinueAsNewErrorWithOptions(ctx Context, wfn interface{}, options ContinueAsNewErrorOptions, args ...interface{}) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func NewContinueAsNewErrorWithOptions(ctx Context, wfn interface{}, options ContinueAsNewErrorOptions, args ...interface{}) error {
func NewContinueAsNewErrorWithOptions(ctx Context, options ContinueAsNewErrorOptions, wfn interface{}, args ...interface{}) *ContinueAsNewError {

To match things that accept options and varargs (e.g. Client.ExecuteWorkflow) let's put the options before the workflow. Also, I wonder if we would like to return the actual continue as new error so they can alter things if they'd like. @Quinn-With-Two-Ns - preference?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few counter-examples listing the options as the last parameter:

  • NewApplicationErrorWithOptions (within the same file)
  • RegisterWorkflowWithOptions

In addition, none of the constructor within this file returns the concrete type (such as *ContinueAsNewError, *ApplicationError.

Please confirm if it's a better convention and I'll update the PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the options as the last parameter:

But it isn't last, it's before args. The concern was "accept options and varargs". Would definitely support options last if it could be last, but in this case, I think it's worth the workflow and args being next to each other instead of split. Would like @Quinn-With-Two-Ns's opinion here though.

In addition, none of the constructor within this file returns the concrete type (such as *ContinueAsNewError, *ApplicationError.

Yeah, I don't have a strong opinion here. I am ok continuing to return just error. Can get others' opinions too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the ordering of parameters in NewContinueAsNewErrorWithOptions, but I am keeping the return type as error.

internal/error.go Show resolved Hide resolved
internal/error.go Outdated Show resolved Hide resolved
@jiezhang jiezhang requested a review from cretz February 13, 2024 17:30
@jiezhang
Copy link
Contributor Author

@cretz Can you unblock the tests?

@cretz
Copy link
Member

cretz commented Feb 13, 2024

Sure, done! (and I'll try to keep on top of that for each commit you make henceforth after quick scan of commit)

@@ -459,6 +477,20 @@ func NewContinueAsNewError(ctx Context, wfn interface{}, args ...interface{}) er
return i.NewContinueAsNewError(ctx, wfn, args...)
}

// NewContinueAsNewErrorWithOptions creates ContinueAsNewError instance with additional options.
func NewContinueAsNewErrorWithOptions(ctx Context, options ContinueAsNewErrorOptions, wfn interface{}, args ...interface{}) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awkward this doesn't go through the NewContinueAsNewError interceptor, but doing it properly would require changing the interceptors type signature to take ContinueAsNewErrorOptions. @cretz thoughts?

Copy link
Member

@cretz cretz Feb 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does go through the existing NewContinueAsNewError interceptor for the initial creation (which can set this value too), just not with the options, correct. I think that's good enough for now. (I would rather not change signature, but add a new interceptor method if we must intercept this separately, but not convinced of that yet)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was initially implemented in the interceptor but due to compatibility concerns the error cannot inherit the RetryPolicy defined in the WorkflowOptions

7da2d4f#diff-18f3f0e55bce44d5acb84164909bef0b9e3ad611d72e84f93db66c45b7218eaaR498

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Member

@cretz cretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM assuming @Quinn-With-Two-Ns is also ok with this

@jiezhang
Copy link
Contributor Author

/merge

@jiezhang
Copy link
Contributor Author

@cretz Can you help to merge the PR?

@cretz cretz merged commit 30da688 into temporalio:master Feb 14, 2024
12 checks passed
@jiezhang jiezhang deleted the retry branch February 14, 2024 22:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add an option to override retry policy
4 participants