Fix Nexus test env to respect ScheduleToCloseTimeout #1636

Merged
rodrigozhou merged 2 commits into master from rodrigozhou/nexus-test-timeout
Oct 1, 2024

@rodrigozhou rodrigozhou commented Sep 16, 2024

What was changed

Fix the Nexus test environment to respect ScheduleToCloseTimeout and return an operation timeout error.

Why?

So that users can test ScheduleToCloseTimeout behavior when writing their tests.

Checklist

  1. Closes

  2. How was this tested:

  3. Any docs updates needed?

@rodrigozhou rodrigozhou requested a review from a team as a code owner September 16, 2024 21:22

@bergundy bergundy left a comment

The task was to actually enforce the timeout, e.g. if the schedule-to-close-timeout is 1 hour, there should be a timer set to time out the operation when that timer fires.

@rodrigozhou rodrigozhou force-pushed the rodrigozhou/nexus-test-timeout branch 2 times, most recently from 45ff424 to a3e3d3e Compare September 26, 2024 04:23
@rodrigozhou rodrigozhou force-pushed the rodrigozhou/nexus-test-timeout branch from a3e3d3e to 74ec134 Compare September 26, 2024 16:35

@bergundy bergundy left a comment

Had a couple of small comments but otherwise LGTM.

@rodrigozhou rodrigozhou requested a review from a team September 26, 2024 21:41
@@ -2363,6 +2367,37 @@ func (env *testWorkflowEnvironmentImpl) ExecuteNexusOperation(params executeNexu
	}
	env.runningNexusOperations[seq] = handle

	var opID string
	if params.options.ScheduleToCloseTimeout > 0 {
		// Timer to fail the nexus operation due to schedule to close timeout.
Member

Hrmm, do we usually mimic timeouts like these in test suites? How does activity execution work in the test suite with regards to schedule to close timeout? Everything LGTM and will mark approved, just want to make sure this is normal for our test suite.

Member

Looks like some timeouts are broken in the test env (according to @rodrigozhou who looked into it). AFAIC, this is part of the behavior for nexus operations. It's enforced in the real server and the Java test server, don't see a good reason not to enforce it here.

Member

I think we need to understand/confirm how timeouts are expected to work in the Go test suite for external calls. I admit being unfamiliar. Do activities in the test suite respect their timeouts? (they might, I just want to check)

Contributor

When an activity is run by the workflow test environment (not mocked), we do propagate the timeout in the activity's environment and rely on the activity returning to fail the operation, which isn't how the real server works, of course.

Member

👍 Works for me

@@ -2887,6 +2923,10 @@ func (h *testNexusOperationHandle) completedCallback(result *commonpb.Payload, e
// startedCallback is a callback registered to handle operation start.
// Must be called in a postCallback block.
func (h *testNexusOperationHandle) startedCallback(opID string, e error) {
	if h.started {
Contributor

How can we get duplicate starts?

Member

If the timer and start request race.

@@ -960,6 +1047,77 @@ func TestWorkflowTestSuite_WorkflowRunOperation_WithCancel(t *testing.T) {
	}
}

func TestWorkflowTestSuite_NexusSyncOperation_ScheduleToCloseTimeout(t *testing.T) {
	sleepDuration := 500 * time.Millisecond
	op := temporalnexus.NewSyncOperation(
Contributor

A sync operation should also have a task timeout of 10s, right? Are we enforcing that?

Member

No. Good call, we should set the Request-Timeout header here as well.
I'll submit a PR for that.

@rodrigozhou rodrigozhou merged commit f0ac2ee into master Oct 1, 2024
14 checks passed
@rodrigozhou rodrigozhou deleted the rodrigozhou/nexus-test-timeout branch October 1, 2024 21:59
ReyOrtiz pushed a commit to ReyOrtiz/temporal-sdk-go that referenced this pull request Dec 5, 2024
* Fix Nexus test env to respect ScheduleToCloseTimeout

* address comments