Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor/dirtools: ensure UploadTree set result symlinks #3800

Merged
merged 2 commits into from
Apr 27, 2023

Conversation

sluongng
Copy link
Contributor

@sluongng sluongng commented Apr 19, 2023

Currently we only support OutputDirs and OutputFiles. Therefore actions
which produce only symlinks would fail when executed remotely with
BuildBuddy.

See bazel-contrib/rules_go#3539 as an example.

Implement support for OutputFileSymlinks and OutputDirectorySymlinks.

Related issues: Fixes https://github.com/buildbuddy-io/buildbuddy-internal/issues/2234

@sluongng sluongng force-pushed the sluongng/dirtools-outputsymlinks branch 2 times, most recently from 5c0a6ad to 7c9dda6 Compare April 20, 2023 12:36
@sluongng sluongng force-pushed the sluongng/dirtools-outputsymlinks branch from 7c9dda6 to 4ea434d Compare April 21, 2023 08:04
@sluongng
Copy link
Contributor Author

To summarize my recent finding:

From Bazel's RemoteExecutionService.java, it seems like Bazel has never made the switch to use output_symlinks field in RE proto.

Instead, it continues using output_directory_symlinks and output_file_symlinks, which led to many confusion on my end why the second iteration of this PR did not work.

Digging up historical records revealed that bazelbuild/remote-apis#96 and bazelbuild/remote-apis#75 were proposed by Googler, but the proto in Bazel was never updated accordingly and only later bump to catchup with some later specs. So it's safe to say that there was very little effort from Bazel side in adopting output_symlinks.


Leveraging the clause in RE API spec:

// DEPRECATED as of v2.1. Servers that wish to be compatible with v2.0 API
// should still populate this field in addition to output_symlinks.

Let's support both new and old fields.

@sluongng sluongng marked this pull request as ready for review April 21, 2023 14:57
@sluongng sluongng force-pushed the sluongng/dirtools-outputsymlinks branch 3 times, most recently from a7cd2f4 to ba59016 Compare April 24, 2023 10:06
@@ -357,6 +355,32 @@ func UploadTree(ctx context.Context, env environment.Env, dirHelper *DirHelper,
if err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was considering adding it here

				if !(dirHelper.ShouldUploadFile(fqfn) || dirHelper.IsOutputPath(fqfn)) {
					continue
				}

But I was worried about breaking existing behavior.

It's unclear to me how often people create/include symlink outputs in their actions without declaring them in Command.output_path, and if we should start explicitly requiring folks to declare them.

@sluongng sluongng force-pushed the sluongng/dirtools-outputsymlinks branch from c4c6b40 to 88c6ef9 Compare April 24, 2023 13:46
@sluongng
Copy link
Contributor Author

sluongng commented Apr 24, 2023

Added some basic tests to make this easier to maintain in the long-run 🙏

@sluongng sluongng force-pushed the sluongng/dirtools-outputsymlinks branch from 8e93141 to e51a3f7 Compare April 24, 2023 16:36
actionResult.OutputFileSymlinks = append(actionResult.OutputFileSymlinks, ospb)
}
}
actionResult.OutputSymlinks = append(actionResult.OutputSymlinks, ospb)
Copy link
Member

@bduffany bduffany Apr 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation of output_symlinks says:

  // New in v2.1: this field will only be populated if the command
  // `output_paths` field was used, and not the pre v2.1 `output_files` or
  // `output_directories` fields.

So based on this part: this field will only be populated if the command output_paths field was used

I think there should be a check around this line like

if IsOutputPath(fqfn) {
  actionResult.OutputSymlinks = append(actionResult.OutputSymlinks, ospb)
}

However, it looks like IsOutputPath is not technically checking the output_paths field 😕 . In workspace.go > SetTask, we are setting outputPaths to either output_paths if set, else concat(output_files, output_directories). This is pretty confusing IMO - we might want to refactor that. But for now we could maybe just pass in the command proto so that we have access to the original output_paths field.

enterprise/server/remote_execution/dirtools/dirtools.go Outdated Show resolved Hide resolved
Comment on lines 380 to 388
if targetInfo.IsDir() {
actionResult.OutputDirectorySymlinks = append(actionResult.OutputDirectorySymlinks, ospb)
} else {
actionResult.OutputFileSymlinks = append(actionResult.OutputFileSymlinks, ospb)
}
Copy link
Member

@bduffany bduffany Apr 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation of output_directory_symlinks says:

  // The output directories of the action that are symbolic links to other
  // directories.

similarly for output_file_symlinks:

  // The output files of the action that are symbolic links to other files.

I am interpreting "the output files of the action" to mean Action.command.output_files and "the output directories of the action" to mean Action.command.output_directories.

If that's correct then we need to check that this symlink file's path has an exact match in output_files or output_directories before we add them to the ActionResult.output_file_symlinks or ActionResult.output_directory_symlinks (respectively).

(Note that this function begins directory traversal starting from the workspace root, not any of the specific output directories or output file/symlink paths, otherwise we wouldn't need an explicit check.)

@sluongng sluongng changed the title executor/dirtools: ensure UploatTree set ActionResult.OutputSymlinks executor/dirtools: ensure UploatTree set result symlinks Apr 25, 2023
@sluongng sluongng force-pushed the sluongng/dirtools-outputsymlinks branch from 3c69693 to 0e3c391 Compare April 25, 2023 13:09
Copy link
Member

@bduffany bduffany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking a lot cleaner! Just a few style/testing-related comments

enterprise/server/remote_execution/dirtools/dirtools.go Outdated Show resolved Hide resolved
enterprise/server/remote_execution/dirtools/dirtools.go Outdated Show resolved Hide resolved
OutputFiles: []string{"a/fileA.txt"},
OutputDirectories: []string{"a"},
Copy link
Member

@bduffany bduffany Apr 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be misunderstanding but I think an OutputDirectory won't include an OutputFile as its child, otherwise the child will get uploaded twice. I think a more representative example would be to have "b" as an output directory, then in the fileContents map below, create a few files fileB.txt and fileC.txt under b

@sluongng sluongng changed the title executor/dirtools: ensure UploatTree set result symlinks executor/dirtools: ensure UploadTree set result symlinks Apr 25, 2023
@sluongng sluongng force-pushed the sluongng/dirtools-outputsymlinks branch 2 times, most recently from 4e78e79 to 3175ae6 Compare April 25, 2023 17:24
Currently we only support OutputDirs and OutputFiles. Therefore actions
which produce only symlinks would fail when executed remotely with
BuildBuddy.

See bazel-contrib/rules_go#3539 as an example.

Implement support for OutputFileSymlinks and OutputDirectorySymlinks.

Related issues: Fixes buildbuddy-io/buildbuddy-internal#2234
@sluongng sluongng force-pushed the sluongng/dirtools-outputsymlinks branch from 3175ae6 to 0b6eded Compare April 26, 2023 09:16
Co-authored-by: Brandon Duffany <brandon@buildbuddy.io>
Copy link
Member

@bduffany bduffany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@sluongng sluongng merged commit eb3e711 into master Apr 27, 2023
@sluongng sluongng deleted the sluongng/dirtools-outputsymlinks branch April 27, 2023 08:37
sluongng added a commit that referenced this pull request Apr 27, 2023
This is a follow-up of #3800

Implement support for ActionResult.output_symlinks by checking usages of
Command.output_paths.

Added tests to verify different cases, including a hybrid case where Bazel
could be using both the new `Command.output_paths` the old fields
`Command.output_files` and `Command.output_directories`.

Added tests to verify different dangling symlink cases.
sluongng added a commit that referenced this pull request May 3, 2023
Used this recently to help with writing tests for #3800 and #3861.
Super helpful to identify code branches with missing tests.
sluongng added a commit that referenced this pull request May 5, 2023
This is a follow-up of #3800

Implement support for ActionResult.output_symlinks by checking usages of
Command.output_paths.

Added tests to verify different cases, including a hybrid case where Bazel
could be using both the new `Command.output_paths` the old fields
`Command.output_files` and `Command.output_directories`.

Added tests to verify different dangling symlink cases.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants