-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fetching symlink inputs for top-level output #11536
Conversation
@buchgr (who wrote the code originally) |
itym @anakanemison |
@anakanemison is not linked to Google. Might want to do that? |
ActionExecutionMetadata action = | ||
(ActionExecutionMetadata) analysisResult.getActionGraph().getGeneratingAction((Artifact) actionInput); | ||
if (action.mayInsensitivelyPropagateInputs()) { | ||
filesToDownload.addAll(action.getInputs().toList()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for there to be more than one symlink-ish action chained together? My concern is that this may be insufficient if the inputs to this action are outputs of other symlink actions. Rewinding recurses in such a case. Perhaps that would be overkill here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
@@ -282,13 +283,13 @@ public BuildOptions getDefaultBuildOptions(BlazeRuntime runtime) { | |||
* @param env the command environment | |||
* @param request the build request | |||
* @param buildOptions the build's top-level options | |||
* @param configuredTargets the build's requested top-level targets as {@link ConfiguredTarget}s | |||
* @param analysisResult the build's requested top-level targets as {@link ConfiguredTarget}s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this description deserves some rewording, given that the parameter's type changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, reworded.
Sorry for the delay. I'm not used to GitHub! I think I'm in the Google org on GitHub already. |
No worries, and thanks for the review. GitHub seems to think that @anakanemison is not in the Google org. For example, you can compare your profile page with Jakob's: @buchgr vs. @anakanemison Did you add your @google.com address to your GitHub profile? |
Thanks for pointing out my empty profile! I entered some information there. (Looks like I could say anything I wanted there..) LGTM. I'll do the internal merging thing and respond here when the change is in. |
|
||
private static void fetchSymlinkDependenciesRecursively( | ||
ActionGraph actionGraph, Set<ActionInput> builder, Artifact artifact) { | ||
ActionExecutionMetadata action = (ActionExecutionMetadata) actionGraph.getGeneratingAction(artifact); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if a top-level artifact is a tree artifact generated by a template action? I think that can happen?
If so, then it should be ok to ignore non-ActionExecutionMetadata instances, since templates shouldn't insensitively propagate their inputs (although I guess theoretically there could be a template action that just symlinked its inputs?). Could pull #mayInsensitivelyPropagateInputs() up to ActionAnalysisMetadata too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Right now that leads to a Bazel crash. Fixed!
The main issue right now are SymlinkAction and SymlinkTreeAction, but in general, any remotely run action can return a symlink in the open-source remote execution API. Therefore, this will eventually need to be rewritten to follow symlinks at execution time. The current design mixes up action execution and output fetching, and separating them should actually result in a nicer implementation, because we can then remove the code that injects remote execution flags into Skyframe.
So yeah, this is a short-term fix to get cc_binary to work properly, and there's more work needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind adding a TODO for that somewhere appropriate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, now that I understand this better, could you add a test exhibiting this behavior? I assume it would be a Starlark action that created a symlink on the remote worker?
The behavior of creating symlinks on the remote worker seems very problematic to me. What happens if the action creates an intermediate file and then its final output is a symlink to that file? We'll just download a dangling symlink? Is there any chance of getting rid of this symlinking feature altogether?
Added a test for top-level template actions and fixed the resulting ClassCastException. PTAL. |
|
||
private static void fetchSymlinkDependenciesRecursively( | ||
ActionGraph actionGraph, Set<ActionInput> builder, Artifact artifact) { | ||
ActionExecutionMetadata action = (ActionExecutionMetadata) actionGraph.getGeneratingAction(artifact); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind adding a TODO for that somewhere appropriate?
3fc2282
to
a467bf8
Compare
Added a TODO as requested. Sorry for the delay - the last month was pretty hectic. |
if (remoteOutputsMode != null && remoteOutputsMode.downloadToplevelOutputsOnly()) { | ||
Preconditions.checkState(actionContextProvider != null, "actionContextProvider was null"); | ||
boolean isTestCommand = env.getCommandName().equals("test"); | ||
TopLevelArtifactContext artifactContext = request.getTopLevelArtifactContext(); | ||
ImmutableSet.Builder<ActionInput> filesToDownload = ImmutableSet.builder(); | ||
for (ConfiguredTarget configuredTarget : configuredTargets) { | ||
Set<ActionInput> filesToDownload = new HashSet<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be Artifact instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This includes test outputs which are not declared as artifacts, so no, not easily.
|
||
private static void fetchSymlinkDependenciesRecursively( | ||
ActionGraph actionGraph, Set<ActionInput> builder, Artifact artifact) { | ||
ActionExecutionMetadata action = (ActionExecutionMetadata) actionGraph.getGeneratingAction(artifact); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, now that I understand this better, could you add a test exhibiting this behavior? I assume it would be a Starlark action that created a symlink on the remote worker?
The behavior of creating symlinks on the remote worker seems very problematic to me. What happens if the action creates an intermediate file and then its final output is a symlink to that file? We'll just download a dangling symlink? Is there any chance of getting rid of this symlinking feature altogether?
056a6ba
to
e007e02
Compare
Bazel currently errors out if the remote executor returns a dangling symlink, and also if it returns a symlink when build-without-the-bytes is enabled. |
Not quite: it returns an error if an action generates a symlink and --remote_download_minimal is active, but not if --remote_download_toplevel is active. I added a test for this. |
Note that genrules can also generate symlinks on the remote machine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the test!
Just the question about the declaration of Set and my new request for the logline.
} | ||
ActionExecutionMetadata action = (ActionExecutionMetadata) actionGraph.getGeneratingAction(artifact); | ||
if (action.mayInsensitivelyPropagateInputs()) { | ||
List<Artifact> inputs = action.getInputs().toList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind adding a warning-level logline here if the list is >5 elements or so? I don't think we expect such actions to insensitively propagate inputs, and if there's a bug, we could slow down significantly due to nested set expansion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Top-level output files can be symlinks to non-top-level output files, such as in the case of a cc_binary linking against a .so built in the same build. I reuse the existing mayInsensitivelyPropagateInputs() method, which was originally introduced for action rewinding, but seems to have exactly the intended semantics here as well. Unfortunately, this requires a small change to the BlazeModule API to pass in the full analysis result (so we can access the action graph, so we can read the aformentioned flag). I considered using execution-phase information on whether an output is a symlink. However, at that point it's too late! The current design only allows pulling an output file *when its generating action runs*. It would be better if this was changed to pull output files independently of action execution. That would avoid a lot of problems with the current design, such as bazelbuild#10902. Fixes bazelbuild#11532. Change-Id: Iaf1e48895311fcf52d9e1802d53598288788a921
Change-Id: Ie1bf49a8d08f0b2422426ecd95fe79b3686f8427
633082b
to
f2c455c
Compare
Top-level output files can be symlinks to non-top-level output files, such as in the case of a cc_binary linking against a .so built in the same build. I reuse the existing mayInsensitivelyPropagateInputs() method, which was originally introduced for action rewinding, but seems to have exactly the intended semantics here as well. Unfortunately, this requires a small change to the BlazeModule API to pass in the full analysis result (so we can access the action graph, so we can read the aformentioned flag). I considered using execution-phase information on whether an output is a symlink. However, at that point it's too late! The current design only allows pulling an output file *when its generating action runs*. It would be better if this was changed to pull output files independently of action execution. That would avoid a lot of problems with the current design, such as bazelbuild#10902. Fixes bazelbuild#11532. Change-Id: Iaf1e48895311fcf52d9e1802d53598288788a921 Closes bazelbuild#11536. Change-Id: Ie1bf49a8d08f0b2422426ecd95fe79b3686f8427 PiperOrigin-RevId: 332939828
Top-level output files can be symlinks to non-top-level output files, such as
in the case of a cc_binary linking against a .so built in the same build.
I reuse the existing mayInsensitivelyPropagateInputs() method, which was
originally introduced for action rewinding, but seems to have exactly the
intended semantics here as well.
Unfortunately, this requires a small change to the BlazeModule API to pass in
the full analysis result (so we can access the action graph, so we can read
the aformentioned flag).
I considered using execution-phase information on whether an output is a
symlink. However, at that point it's too late! The current design only allows
pulling an output file when its generating action runs. It would be better if
this was changed to pull output files independently of action execution. That
would avoid a lot of problems with the current design, such as #10902.
Fixes #11532.
Change-Id: Iaf1e48895311fcf52d9e1802d53598288788a921