-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid different transition output directories when build settings have the same values #14023
Comments
Ok, I have a somewhat long update to this after spending a few weeks off-and-on contemplating this and related issues: Core Background bazel/src/main/java/com/google/devtools/build/lib/skyframe/ConfiguredTargetKey.java Line 32 in 33f7648
Summarizing, Bazel has a key correctness constraint that, for every active Target across an entire build, the associated ConfiguredTargetKeys (created with a Label of that Target) must result in different output directories being computed during rule analysis. Note that Skyframe infrastructure is sufficiently intelligent to ensure that, between builds, stale keys do not cause action conflicts if their output directories match active keys.
Deeper dive on ensuring correctness of build
Additional constraints for more performant builds Further, presuming the use of action caching, there is another optimization available: avoid changing the output directory names as much as possible between incremental builds when command-line options change if those command-line options would have no other effect on the generated actions besides changing the output directory name. This leads to the following observations:
"affected by Starlark transition" is not quite right
Potential Resolutions and Related work
Addressing [3] is potentially more complex and I have two competing solutions at this time:
In practice, this could be enough to rename "affected by Starlark transition" to something closer to intended like "changed options to hash" and, if doing the diff is cheap enough (or cachable), treat all native transitions like Starlark transitions and also audit their changes. |
As a quick update after conversation with the Bazel core team (experts on the underlying Skyframe infrastructure). Likely to skip the first proposal to address [3] "Change "affected by Starlark transition" to instead be a map from option name to original (command-line) value." Instead, the second proposal should be technically feasible with similar effort (and better performance). BuildConfigurationFunction will properly request/register a dependency on the top-level configuration via the Skyframe environment (likely going to injected by using the PrecomputedValue Skyframe feature). In order for this to work correctly (and avoid unnecessary recomputation of ConfiguredTargetFunction), will need to make sure BuildConfigurationValue.equals is sensitive to the OutputDirectories object it holds (as well as the underlying BuildOptions, of course). At the moment, it looks like it relies on reference equality (and thus is overly sensitive to re-computation). |
Thanks for this update, @sdtwigg . My notes on the above: Values:
Proposal:
|
Copying some notes I took on the problem a while back: Do these match your plan? There's another place we can set hash values: In other words, we can exclusively say that a
Just follow the precedent of That includes stuff like here, which provides the top-level host configuration. It's pretty trivial to expand that to provide any other top-level context. |
Can we simply remove |
…nValue As per discussion in b/203470434, transitionDirectoryNameFragment should completely depend on the current values of the (rest of) the BuildOptions class. Thus, it is far better to have this always computed from BuildOptions when building a BuildConfigurationValue than rely on users keeping it consistent. (Other results of BuildConfigurationValue itself are themselves wholly computed from BuildOptions so placement there is a natural fit.) This naturally fixes the exec transition forgetting to update transitionDirectoryNameFragment. This fixes and subsumes #13464 and #13915, respectively. This is related to #14023 PiperOrigin-RevId: 407913175
Thanks for putting in all the work to write this up, I'm excited to see this is on a path to eliminate the horrible hacks we are currently using in our codebase to make this work. In the process of implementing a number of "unifying" starlark transitions in our codebase, one realization we've had is that it's often useful to be able to explicitly "delete" a configuration setting, meaning reset it to the command-line value. For now, we can get by with hard-coding those default values into the various transitions that need to access them, but it would eventually be nice to be able to explicitly perform a "reset" operation from a Starlark transition. It sounds like the proposed Skyframe changes would open the path to that, but I'm not familiar enough with that system to know for sure. |
To answer this question, I quickly prototyped this approach in fmeum@5eefd23. This has the obvious noted deficencies and contains debug prints, but allowed me to get a better understanding of what the computed
Based on the contents of the typical diffs, I wonder whether Edit: This seems to be available from @gregestren @sdtwigg Feel free to use (or ignore) this code under the CLA terms. If you are still worried about licensing issues, I could also submit it as a draft PR. |
Yes, there is a way. And I agree the top-level config makes a better baseline. Since @sdtwigg has plans along these lines I'd leave it to you two to converge. |
…nValue As per discussion in b/203470434, transitionDirectoryNameFragment should completely depend on the current values of the (rest of) the BuildOptions class. Thus, it is far better to have this always computed from BuildOptions when building a BuildConfigurationValue than rely on users keeping it consistent. (Other results of BuildConfigurationValue itself are themselves wholly computed from BuildOptions so placement there is a natural fit.) This naturally fixes the exec transition forgetting to update transitionDirectoryNameFragment. This fixes and subsumes bazelbuild#13464 and bazelbuild#13915, respectively. This is related to bazelbuild#14023 PiperOrigin-RevId: 407913175
* Move transitionDirectoryNameFragment calculation to BuildConfigurationValue As per discussion in b/203470434, transitionDirectoryNameFragment should completely depend on the current values of the (rest of) the BuildOptions class. Thus, it is far better to have this always computed from BuildOptions when building a BuildConfigurationValue than rely on users keeping it consistent. (Other results of BuildConfigurationValue itself are themselves wholly computed from BuildOptions so placement there is a natural fit.) This naturally fixes the exec transition forgetting to update transitionDirectoryNameFragment. This fixes and subsumes #13464 and #13915, respectively. This is related to #14023 PiperOrigin-RevId: 407913175 * Properly account for StarlarkOptions at their default (=null) when calculating ST-hash Previously, the hash calculation was skipping including StarlarkOptions that happened to be at their default values. This is wrong since those values may still be in "affected by Starlark transition" (because either the commandline set them and the Starlark transition reset them to their Starlark defaults thus still requiring a hash change OR the commandline did not set them but a series of Starlark transitions did an default->B->default anyways causing the Starlark option to still be 'stuck' in "affected by Starlark transition"). Resolves #14239 PiperOrigin-RevId: 408701552 Co-authored-by: twigg <twigg@google.com>
As another piece of work (to be done after removal of "affected by Starlark transition" so tjat the results native transitions are also being properly diffed against the top-level configuration), we can then remove the ad hoc output directories distinguishers used by AppleConfiguration and AndroidConfiguration in their configurationDistinguisher option. (Morally, the configurationDistinguisher is very similar to "affected by Starlark transition" and "transition directory name fragment" in that they are wholly functions of the current configuration and its diff to other configurations. Thus, their function should be wholly subsumed by the more generic calculation of ST-hash.) |
See OptionMetadataTag.java for documentation on what the tag does. Also audited the transitive calls of OutputDirectories.buildMnemonic and annotated options that seemingly can safely have the tag. (Note that some, like AppleConfiguration, have a getOutputDirectory that is too intricate for me to comfortably add the tag to any of the associated options.) Related to bazelbuild/bazel#14023. PiperOrigin-RevId: 410335434
See OptionMetadataTag.java for documentation on what the tag does. Also audited the transitive calls of OutputDirectories.buildMnemonic and annotated options that seemingly can safely have the tag. (Note that some, like AppleConfiguration, have a getOutputDirectory that is too intricate for me to comfortably add the tag to any of the associated options.) Related to #14023. PiperOrigin-RevId: 410335434
@sdtwigg @gregestren Could you provide a brief update on the efforts to make |
@sdtwigg has some pending code at https://bazel-review.googlesource.com/c/bazel/+/181730 that I'm not aware of anything blocking it from merging. I don't know his ETA for the necessary followups. I'm taking a holiday break through the new year but I'll make sure to follow up after Jan 4. |
…_baseline` Flipping as per bazelbuild#14023 (comment). Work towards bazelbuild#14023
…_baseline` Flipping as per bazelbuild#14023 (comment). Work towards bazelbuild#14023
…tal_exec_configuration_distinguisher` Flipping as per bazelbuild#14023 (comment). Also changes the platform suffix in the exec cfg to a fixed `exec` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. Work towards bazelbuild#14023
…tal_exec_configuration_distinguisher` Flipping as per bazelbuild#14023 (comment). Also changes the platform suffix in the exec cfg to a fixed `exec` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. Work towards bazelbuild#14023
…tal_exec_configuration_distinguisher` Flipping as per bazelbuild#14023 (comment). Also changes the platform suffix in the exec cfg to a fixed `exec` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. Work towards bazelbuild#14023
…tal_exec_configuration_distinguisher` Flipping as per bazelbuild#14023 (comment). Also changes the platform suffix in the exec cfg to a fixed `exec` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. Work towards bazelbuild#14023
…tal_exec_configuration_distinguisher` Flipping as per bazelbuild#14023 (comment). Also changes the platform suffix in the exec cfg to a fixed `exec` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. Work towards bazelbuild#14023
Flips `--experimental_output_directory_naming_scheme` and `--experimental_exec_configuration_distinguisher` to `diff_against_baseline` and `off`, respectively. Also changes the fixed platform suffix in the exec configuration from an empty string to `"exec"` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. As a result of this commit, output directory paths can change in the following ways: * `-ST-<hash>` suffixes can change or disappear and, in rare cases, appear where they previously didn't * for exec configuration directories, `k8-opt-exec-2B5CBBC6` becomes `k8-opt-exec-ST-011b9bdc32ed` This may result in tool paths that are seven characters longer, which can cause builds on Windows to fail that were already very close to the `MAX_PATH` limit. A follow-up commit will reduce the length of the hash by three characters, bringing that increase down to four. Work towards bazelbuild#14023
Flips `--experimental_output_directory_naming_scheme` and `--experimental_exec_configuration_distinguisher` to `diff_against_baseline` and `off`, respectively, which makes it so that Starlark transitions can return to previous configurations, thus improving cache hit rates. Also changes the fixed platform suffix in the exec configuration from an empty string to `"exec"` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. As a result of this commit, output directory paths can change in the following ways: * `-ST-<hash>` suffixes can change or disappear and, in rare cases, appear where they previously didn't * for exec configuration directories, `k8-opt-exec-2B5CBBC6` becomes `k8-opt-exec-ST-011b9bdc32ed` This may result in tool paths that are seven characters longer, which can cause builds on Windows to fail that were already very close to the `MAX_PATH` limit. A follow-up commit will reduce the length of the hash by three characters, bringing that increase down to four. Work towards bazelbuild#14023
Flips `--experimental_output_directory_naming_scheme` and `--experimental_exec_configuration_distinguisher` to `diff_against_baseline` and `off`, respectively, which makes it so that Starlark transitions can return to previous configurations, thus improving cache hit rates. Also changes the fixed platform suffix in the exec configuration from an empty string to `"exec"` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. As a result of this commit, output directory paths can change in the following ways: * `-ST-<hash>` suffixes can change or disappear and, in rare cases, appear where they previously didn't * for exec configuration directories, `k8-opt-exec-2B5CBBC6` becomes `k8-opt-exec-ST-011b9bdc32ed` This may result in tool paths that are seven characters longer, which can cause builds on Windows to fail that were already very close to the `MAX_PATH` limit. A follow-up commit will reduce the length of the hash by three characters, bringing that increase down to four. Work towards bazelbuild#14023
Flips `--experimental_output_directory_naming_scheme` and `--experimental_exec_configuration_distinguisher` to `diff_against_baseline` and `off`, respectively, which makes it so that Starlark transitions can return to previous configurations, thus improving cache hit rates. Also changes the fixed platform suffix in the exec configuration from an empty string to `"exec"` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. As a result of this commit, output directory paths can change in the following ways: * `-ST-<hash>` suffixes can change or disappear and, in rare cases, appear where they previously didn't * for exec configuration directories, `k8-opt-exec-2B5CBBC6` becomes `k8-opt-exec-ST-011b9bdc32ed` This may result in tool paths that are seven characters longer, which can cause builds on Windows to fail that were already very close to the `MAX_PATH` limit. A follow-up commit will reduce the length of the hash by three characters, bringing that increase down to four. Work towards bazelbuild#14023
Flips `--experimental_output_directory_naming_scheme` and `--experimental_exec_configuration_distinguisher` to `diff_against_baseline` and `off`, respectively, which makes it so that Starlark transitions can return to previous configurations, thus improving cache hit rates. Also changes the fixed platform suffix in the exec configuration from an empty string to `"exec"` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. As a result of this commit, output directory paths can change in the following ways: * `-ST-<hash>` suffixes can change or disappear and, in rare cases, appear where they previously didn't * for exec configuration directories, `k8-opt-exec-2B5CBBC6` becomes `k8-opt-exec-ST-011b9bdc32ed` This may result in tool paths that are seven characters longer, which can cause builds on Windows to fail that were already very close to the `MAX_PATH` limit. A follow-up commit will reduce the length of the hash by three characters, bringing that increase down to four. Work towards bazelbuild#14023
Flips `--experimental_output_directory_naming_scheme` and `--experimental_exec_configuration_distinguisher` to `diff_against_baseline` and `off`, respectively, which makes it so that Starlark transitions can return to previous configurations, thus improving cache hit rates. Also changes the fixed platform suffix in the exec configuration from an empty string to `"exec"` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. As a result of this commit, output directory paths can change in the following ways: * `-ST-<hash>` suffixes can change or disappear and, in rare cases, appear where they previously didn't * for exec configuration directories, `k8-opt-exec-2B5CBBC6` becomes `k8-opt-exec-ST-011b9bdc32ed` This may result in tool paths that are seven characters longer, which can cause builds on Windows to fail that were already very close to the `MAX_PATH` limit. A follow-up commit will reduce the length of the hash by three characters, bringing that increase down to four. Work towards #14023 Closes #16910. PiperOrigin-RevId: 523999034 Change-Id: Id5548a00b62ebfe7889cd40177995210ecc224c1
Flips `--experimental_output_directory_naming_scheme` and `--experimental_exec_configuration_distinguisher` to `diff_against_baseline` and `off`, respectively, which makes it so that Starlark transitions can return to previous configurations, thus improving cache hit rates. Also changes the fixed platform suffix in the exec configuration from an empty string to `"exec"` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. As a result of this commit, output directory paths can change in the following ways: * `-ST-<hash>` suffixes can change or disappear and, in rare cases, appear where they previously didn't * for exec configuration directories, `k8-opt-exec-2B5CBBC6` becomes `k8-opt-exec-ST-011b9bdc32ed` This may result in tool paths that are seven characters longer, which can cause builds on Windows to fail that were already very close to the `MAX_PATH` limit. A follow-up commit will reduce the length of the hash by three characters, bringing that increase down to four. Work towards bazelbuild#14023 Closes bazelbuild#16910. PiperOrigin-RevId: 523999034 Change-Id: Id5548a00b62ebfe7889cd40177995210ecc224c1
Flips `--experimental_output_directory_naming_scheme` and `--experimental_exec_configuration_distinguisher` to `diff_against_baseline` and `off`, respectively, which makes it so that Starlark transitions can return to previous configurations, thus improving cache hit rates. Also changes the fixed platform suffix in the exec configuration from an empty string to `"exec"` for compatibility with logic that looks for the substring `-exec-` in paths to determine whether the current configuration is the exec configuration. As a result of this commit, output directory paths can change in the following ways: * `-ST-<hash>` suffixes can change or disappear and, in rare cases, appear where they previously didn't * for exec configuration directories, `k8-opt-exec-2B5CBBC6` becomes `k8-opt-exec-ST-011b9bdc32ed` This may result in tool paths that are seven characters longer, which can cause builds on Windows to fail that were already very close to the `MAX_PATH` limit. A follow-up commit will reduce the length of the hash by three characters, bringing that increase down to four. Work towards bazelbuild#14023 Closes bazelbuild#16910. PiperOrigin-RevId: 523999034 Change-Id: Id5548a00b62ebfe7889cd40177995210ecc224c1
…tory_naming_scheme. To support, a new SkyFunction is introduced, BaselineOptionsFunction, with corresponding SkyValue and SkyKey. Also, a new function ExecutionTransitionFactory.createTransition was added to immediately create an ExecutionTransition. When --experimental_output_directory_naming_scheme=diff_against_dynamic_baseline, the behavior is similar to diff_against_baseline, except when "is Exec" is true (that is, an ExecutionTransition has previously been applied), the baseline is chosen to be result of the ExecutionTransition being applied to the normal baseline. This is meant to resolve an issue with performance when using remote execution engines that collate requests from multiple users. They were seeing varying output paths for actions in an execution configuration depending on how much 'resetting' the exec transition was doing because of varying user commandlines. This relates to #14023 and #18002 PiperOrigin-RevId: 537097551 Change-Id: I72966406b7b8af0174a81b638bf0d1fb62466653
…tory_naming_scheme. To support, a new SkyFunction is introduced, BaselineOptionsFunction, with corresponding SkyValue and SkyKey. Also, a new function ExecutionTransitionFactory.createTransition was added to immediately create an ExecutionTransition. When --experimental_output_directory_naming_scheme=diff_against_dynamic_baseline, the behavior is similar to diff_against_baseline, except when "is Exec" is true (that is, an ExecutionTransition has previously been applied), the baseline is chosen to be result of the ExecutionTransition being applied to the normal baseline. This is meant to resolve an issue with performance when using remote execution engines that collate requests from multiple users. They were seeing varying output paths for actions in an execution configuration depending on how much 'resetting' the exec transition was doing because of varying user commandlines. This relates to bazelbuild#14023 and bazelbuild#18002 PiperOrigin-RevId: 537097551 Change-Id: I72966406b7b8af0174a81b638bf0d1fb62466653
…tory_naming_scheme. To support, a new SkyFunction is introduced, BaselineOptionsFunction, with corresponding SkyValue and SkyKey. Also, a new function ExecutionTransitionFactory.createTransition was added to immediately create an ExecutionTransition. When --experimental_output_directory_naming_scheme=diff_against_dynamic_baseline, the behavior is similar to diff_against_baseline, except when "is Exec" is true (that is, an ExecutionTransition has previously been applied), the baseline is chosen to be result of the ExecutionTransition being applied to the normal baseline. This is meant to resolve an issue with performance when using remote execution engines that collate requests from multiple users. They were seeing varying output paths for actions in an execution configuration depending on how much 'resetting' the exec transition was doing because of varying user commandlines. This relates to bazelbuild#14023 and bazelbuild#18002 PiperOrigin-RevId: 537097551 Change-Id: I72966406b7b8af0174a81b638bf0d1fb62466653
…tory_naming_scheme. To support, a new SkyFunction is introduced, BaselineOptionsFunction, with corresponding SkyValue and SkyKey. Also, a new function ExecutionTransitionFactory.createTransition was added to immediately create an ExecutionTransition. When --experimental_output_directory_naming_scheme=diff_against_dynamic_baseline, the behavior is similar to diff_against_baseline, except when "is Exec" is true (that is, an ExecutionTransition has previously been applied), the baseline is chosen to be result of the ExecutionTransition being applied to the normal baseline. This is meant to resolve an issue with performance when using remote execution engines that collate requests from multiple users. They were seeing varying output paths for actions in an execution configuration depending on how much 'resetting' the exec transition was doing because of varying user commandlines. This relates to bazelbuild#14023 and bazelbuild#18002 PiperOrigin-RevId: 537097551 Change-Id: I72966406b7b8af0174a81b638bf0d1fb62466653
…tory_naming_scheme. To support, a new SkyFunction is introduced, BaselineOptionsFunction, with corresponding SkyValue and SkyKey. Also, a new function ExecutionTransitionFactory.createTransition was added to immediately create an ExecutionTransition. When --experimental_output_directory_naming_scheme=diff_against_dynamic_baseline, the behavior is similar to diff_against_baseline, except when "is Exec" is true (that is, an ExecutionTransition has previously been applied), the baseline is chosen to be result of the ExecutionTransition being applied to the normal baseline. This is meant to resolve an issue with performance when using remote execution engines that collate requests from multiple users. They were seeing varying output paths for actions in an execution configuration depending on how much 'resetting' the exec transition was doing because of varying user commandlines. This relates to bazelbuild#14023 and bazelbuild#18002 PiperOrigin-RevId: 537097551 Change-Id: I72966406b7b8af0174a81b638bf0d1fb62466653
…ut_direc… (#19514) …tory_naming_scheme. To support, a new SkyFunction is introduced, BaselineOptionsFunction, with corresponding SkyValue and SkyKey. Also, a new function ExecutionTransitionFactory.createTransition was added to immediately create an ExecutionTransition. When --experimental_output_directory_naming_scheme=diff_against_dynamic_baseline, the behavior is similar to diff_against_baseline, except when "is Exec" is true (that is, an ExecutionTransition has previously been applied), the baseline is chosen to be result of the ExecutionTransition being applied to the normal baseline. This is meant to resolve an issue with performance when using remote execution engines that collate requests from multiple users. They were seeing varying output paths for actions in an execution configuration depending on how much 'resetting' the exec transition was doing because of varying user commandlines. This relates to #14023 and #18002 PiperOrigin-RevId: 537097551 Change-Id: I72966406b7b8af0174a81b638bf0d1fb62466653 Co-authored-by: Googler <twigg@google.com>
Description of the problem / feature request:
To avoid duplicated actions on dependencies when using build-settings it's necessary to reset build-settings to it's default value. This is currently not enough as we still can get conflicts on the "transition directory name fragment" for instance when two subsystems using different build-settings depend on a common library.
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
The below example shows the issue with two build settings (used by B and C) triggering and action conflict in a common library (ExtDep) which appears in three branches from the top target:
A -> B -> ExtDep
A -> B -> C -> ExtDep
A -> ExtDep
(issue also provoked withA -> C
)B sets a non-default value for bs2 in the input transition and resets it to default value in the edge transition to ExtDep.
Rule names below indicate which transitions they perform, eg the name "input_bs1_extdeps_bs2" indicates that it sets "bs1" build setting in the input transition and "bs2" build setting in the edge transition on the "extdeps" attribute.
rules.bzl
BUILD.bazel
Example run:
Opts passed to transitionDirectoryNameFragment function:
What operating system are you running Bazel on?
Linux
What's the output of
bazel info release
?development version
If
bazel info release
returns "development version" or "(@non-git)", tell us how you built Bazel.What's the output of
git remote get-url origin ; git rev-parse master ; git rev-parse HEAD
?not applicable
Issue isolated on a custom version of v5.0.0-pre.20210604.6 but originally found in 4.x versions
Have you found anything relevant by searching the web?
Related issues: #12171 #12731
Any other information, logs, or outputs that you want to share?
Prior to #13580 it's possible to trigger the issue with only
B -> ExtDeps
andB -> C -> ExtDeps
given that bs1 is not included in the extdeps edge transition (found with git bisect building //:B above but using input_outdir_bs1_bs2_extdeps_bs2 instead of input_outdir_bs1_bs2_extdeps_bs1_bs2)#13587 in combination with a custom output directory scheme using transition on "//command_line_option:output directory name" is designed to address all issues leading to conflicting "transition directory name fragment". This transition turns these types of issues into build failures due to action conflicts rather than silently duplicating actions in different output folders.
The text was updated successfully, but these errors were encountered: