Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remote: make the dynamic spawn scheduler work. Fixes #8646 #8648

Closed
wants to merge 5 commits into from

Conversation

buchgr
Copy link
Contributor

@buchgr buchgr commented Jun 17, 2019

This change fixes the correctness issue of dynamic spawn scheduler when being used with remote execution. See #8646 for more details.

There's a performance issue remaining: #8647

@buchgr buchgr requested a review from ishikhman June 17, 2019 12:21
@buchgr buchgr requested a review from ola-rozenfeld as a code owner June 17, 2019 12:21
Copy link
Contributor

@ishikhman ishikhman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, is it that simple?

Other than failing CI, LGTM :)

@@ -304,6 +304,10 @@ private SpawnResult downloadAndFinalizeSpawnResult(
SpawnExecutionContext context,
RemoteOutputsMode remoteOutputsMode)
throws ExecException, IOException, InterruptedException {
// Ensure that when using dynamic spawn strategy that we are the only ones writing to the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reword: "Ensure that we are the only ones writing to the output files when using the dynamic spawn strategy."

Also not sure why the bug link in the code is necessary. Yes, there was a bug that's fixed with this change. Reference it in the commit description instead?

buchgr added 3 commits June 18, 2019 15:56
This change fixes the correctness issue of dynamic spawn scheduler when
being used with remote execution. See bazelbuild#8646 for more details.

There's a performance issue remaining: bazelbuild#8467
between RemoteSpawnRunnerTest and GrpcRemoteExecutionClientTest.
@buchgr buchgr force-pushed the fix-dynamic-spawn-strategy branch from f642b98 to 4399f02 Compare June 18, 2019 13:56
@buchgr
Copy link
Contributor Author

buchgr commented Jun 18, 2019

I updated the code to also include the .tmp download optimization. ptal.

@ishikhman
Copy link
Contributor

I updated the code to also include the .tmp download optimization. ptal.

Do you mean #8647 or something else?

@buchgr
Copy link
Contributor Author

buchgr commented Jun 18, 2019

yes #8647

Copy link
Contributor

@ishikhman ishikhman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change the commit message/PR description that #8647 is also addressed here.

Do we have integration tests for dynamic strategy?

@buchgr
Copy link
Contributor Author

buchgr commented Jun 18, 2019

I ll update the commit message when importing. I split it out into three commits for easy review :).

Do we have integration tests for dynamic strategy?

I am quite sure we don't. It seems hard to test in an integration test?

@ishikhman
Copy link
Contributor

I ll update the commit message when importing. I split it out into three commits for easy review :).

thanks! :)

Do we have integration tests for dynamic strategy?

I am quite sure we don't. It seems hard to test in an integration test?

well, maybe it is hard to test details (like whether the files were downloaded to the temp directory and then moved, racing conditions), but we can still have test for a base case, to make sure that it works, i.e. not failing and outputs in the end are in the right place.

@buchgr
Copy link
Contributor Author

buchgr commented Jun 19, 2019

well, maybe it is hard to test details (like whether the files were downloaded to the temp directory and then moved, racing conditions), but we can still have test for a base case, to make sure that it works, i.e. not failing and outputs in the end are in the right place.

how would one do that with reliably triggering the dynamic spawn strategy?

@ishikhman
Copy link
Contributor

well, maybe it is hard to test details (like whether the files were downloaded to the temp directory and then moved, racing conditions), but we can still have test for a base case, to make sure that it works, i.e. not failing and outputs in the end are in the right place.

how would one do that with reliably triggering the dynamic spawn strategy?

simple action with --remote_executor pointing to a RemoteWorker and enabled dynamic spawn strategy

@buchgr
Copy link
Contributor Author

buchgr commented Jun 19, 2019

simple action with --remote_executor pointing to a RemoteWorker and enabled dynamic spawn strategy

this does not trigger the dynamic spawn strategy. it will just trigger the remote spawn strategy.

@ishikhman
Copy link
Contributor

simple action with --remote_executor pointing to a RemoteWorker and enabled dynamic spawn strategy

this does not trigger the dynamic spawn strategy. it will just trigger the remote spawn strategy.

then how do we know that it didn't work before your fix in the first place?

@buchgr
Copy link
Contributor Author

buchgr commented Jun 19, 2019

then how do we know that it didn't work before your fix in the first place?

user reports and manual testing :\

Copy link
Contributor

@ishikhman ishikhman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that the simple integration test will be added as a separate change ;)

LGTM

@bazel-io bazel-io closed this in d75b6cf Jun 19, 2019
siberex pushed a commit to siberex/bazel that referenced this pull request Jul 4, 2019
This change fixes the correctness issue of dynamic spawn scheduler when being used with remote execution. See bazelbuild#8646 for more details.

There's a performance issue remaining: bazelbuild#8647

Closes bazelbuild#8648.

PiperOrigin-RevId: 253998300
irengrig pushed a commit to irengrig/bazel that referenced this pull request Jul 15, 2019
This change fixes the correctness issue of dynamic spawn scheduler when being used with remote execution. See bazelbuild#8646 for more details.

There's a performance issue remaining: bazelbuild#8647

Closes bazelbuild#8648.

PiperOrigin-RevId: 253998300
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants