-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Node properties: support for leaf nodes #475
Conversation
go/pkg/tool/tool.go
Outdated
} else { | ||
inputRoot = filepath.Join(actionRoot, "input") | ||
npPath := filepath.Join(actionRoot, "input_node_properties.textproto") | ||
if _, err := os.Stat(npPath); err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to rewrite this with early return, but failed -- only way I saw to return early is to duplicate one of the if clauses, which was, imo, worse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This if-else
block seems to be doing a "set inputRoot
and nodeProperties
based on the value of actionRoot
". That sounds like a good candidate for a helper function that would facilitate using less indentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thank you, good idea! I did one better, I think -- I split the input root handling and the node properties, because they have nothing to do with each other. PTAL.
// In v2.1 of the RE API the `output_{files, directories}` fields were | ||
// replaced by a single field: `output_paths`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the proper way of supporting v2.1 is to follow the same deprecation semantics by adding an OutputPaths
field, removing the deprecated ones, and handling the precedence during this conversion here as defined in the API.
I'm concerned about unverified use-cases that might break by nilling OutputDirs
and assuming all paths fit into OutputFiles
.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Problem is, that deprecation never really worked out. Bazel still uses both fields (I see some recent action trying to move towards using one, but no resolution). I suspect Google's RBE still uses files/dirs. So we need to support both fields as much as possible in this client until version 3.0 actually removes the deprecated fields (and not sure if that'll ever happen...).
Here is what I think we should do:
- when composing an entirely new RE-API proto, we should decide which fields to use based on server capabilities (which we already do AFAICT).
- when parsing an RE-API proto, we need to work correctly with all versions: if
output_paths
is set, it should supercedeoutput_files/dirs
. Which is what I implement here. - the composing and parsing should be orthogonal -- the tool should support reading protos in one format and outputting them in another (e.g. action was downloaded from a different RE than it is executed on, or the RE migrates to a different API version)
The previous implementation didn't support output_paths at all, which meant this tool was not working with servers supporting 2.1, and here I fix that. AFAICT we can indeed safely put all outputs in output_files
, because we shouldn't care what they really are; if there is any code that will be broken by this assumption, we need to find out ASAP and fix it, because we actually can't know the type of the output. But I believe that's extremely unlikely. Basically, the only reason these exist as split fields in Command
to begin with is because they were split in the RE-API and we needed to support that. All we do is pass them through to RE. We do no semantic checks on them.
Adding an OutputPaths
field in Command
is possible, but imo it will only add complexity, as we'll still need to populate the RE proto based on all fields, and maintain the fields in sync. I think just using the output files field is simpler. Of course, if someone tries to run a saved 2.1 command with an output directory on a 2.0 RE, this will not work. without them manually modifying the saved proto, but that's true regardless of what field we use.
Still TODO, btw:
- make the fake server support both RE-API versions (via an option)
- use parameterized tests to test both, to make sure that we support all combinations (can run old commands on new servers, and vice versa).
- improve the tool to be more explicit when the action, command and input root digests actually change from what has been requested / found on disk. The tool currently silently recomputes a lot of digests, which can be misleading and cover up errors such as API version mismatches leading to different digests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the well laid-out comment.
My concern was specifically around losing the information about which paths are directories. As you mentioned, this is only a problem when passing a command from 2.1+ server to a 2.0- one, in which case the information (directories) wasn't or shouldn't be there in the first place.
go/pkg/tool/tool.go
Outdated
} else { | ||
inputRoot = filepath.Join(actionRoot, "input") | ||
npPath := filepath.Join(actionRoot, "input_node_properties.textproto") | ||
if _, err := os.Stat(npPath); err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This if-else
block seems to be doing a "set inputRoot
and nodeProperties
based on the value of actionRoot
". That sounds like a good candidate for a helper function that would facilitate using less indentation.
8d2c2c9
to
1ab18ae
Compare
go/pkg/tool/tool.go
Outdated
for _, ev := range cmdPb.EnvironmentVariables { | ||
cmd.InputSpec.EnvironmentVariables[ev.Name] = ev.Value | ||
func readNodePropertiesFromFile(path string) (nps map[string]*repb.NodeProperties, err error) { | ||
if _, err = os.Stat(path); err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rearrange the conditions such that the terminal one appears first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, much better now, thank you! PTAL.
// In v2.1 of the RE API the `output_{files, directories}` fields were | ||
// replaced by a single field: `output_paths`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the well laid-out comment.
My concern was specifically around losing the information about which paths are directories. As you mentioned, this is only a problem when passing a command from 2.1+ server to a 2.0- one, in which case the information (directories) wasn't or shouldn't be there in the first place.
Progress on #460 (95% of the fix).
In this PR, only support node properties for leaf nodes (files and empty directories); the reason is that to do it for all nodes will require me to change a current invariant / output in tree.go, and I'd like to split it to a separate PR, as it is a bit more invasive change. Also, in practice support for non-leaf nodes is much less urgent, because the main goal here is to support Bazel's use case, which is to mark files as "bazel_tool_input" in order to implement persistent remote workers.