Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tags propagation: Starlark rules part #8612

Closed
wants to merge 7 commits into from

Conversation

ishikhman
Copy link
Contributor

Tags declared on targets are not propagated to actions and therefore are not taken into consideration by bazel. This causes some issues, for instance, target marked with a tag 'no-remote' will still be executed remotely.
As it was agreed in the design doc (see doc and #7766 for details), set of tags to be propagated to actions as a first iteration.
This change is responsible for that first step for the Starlark Rules.

RELNOTES: tags 'no-remote', 'no-cache', 'no-remote-cache', 'no-remote-exec', 'no-sandbox' are propagated now to the actions from targets.

Closes #7766

Tags declared on targets are not propagated to actions and therefore are not taken into consideration by bazel. This causes some issues, for instance, target marked with a tag 'no-remote' will still be executed remotely.
As it was agreed in the design doc (see bazelbuild#7766 for a link), set of tags to be propagated to actions as a first iteration.
This change is responsible for that first step for the Starlark Rules.

RELNOTES: tags 'no-remote', 'no-cache', 'no-remote-cache', 'no-remote-exec', 'no-sandbox' are propagated now to the actions from targets.

Closes bazelbuild#7766
Copy link
Contributor

@buchgr buchgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a fan of white listing tags. It's error prone. Couldn't we apply the same filter that we apply to execution_info also to tags? What's the downside?

// we do not want to propagate custom user's tags or create potential conflicts
// with execution requirements declared on the rules.
// See https://github.com/bazelbuild/bazel/issues/7766 for details.
private static final Predicate<String> TAGS_PROPAGATED_TO_EXEC_INFO =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is this compatible with LEGAL_EXEC_INFO_KEYS defined a few lines above, which also filters exec requirements and the comment says "We also don't want to exhaustively enumerate all the legal values here.". Couldn't we unify these two predicates?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could, but this is not what we've agreed on in the initial design.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent what's in the design doc, let's not do it if it makes no sense? :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to keep them separate because they serve different purposes: LEGAL_EXEC_INFO_KEYS specifies the universe of execution info keys, and TAGS_PROPAGATED_TO_EXEC_INFO specifies the specific tags that can be propagated.

TAGS_PROPAGATED_TO_EXEC_INFO is a subset of LEGAL_EXEC_INFO_KEYS

In getExecutionInfoFromTags, we only propagate tags that satisfy TAGS_PROPAGATED_TO_EXEC_INFO, irrespective of whether they satisfy LEGAL_EXEC_INFO_KEYS; I don't see how we'd do this check if the two predicates were combined.

|| tag.equals("no-cache")
|| tag.equals("no-sandbox")
|| tag.equals("no-remote-exec")
|| tag.equals("no-remote-cache");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Afaik no-remote-exec and no-remote-cache don't exist as execution requirements (yet)? Also take a look at the ExecutionRequirements where we define names for all the available tags.

Copy link
Contributor

@SrodriguezO SrodriguezO Jun 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also can't find any reference to those two tags having special meaning here: https://docs.bazel.build/versions/master/be/common-definitions.html#common-attributes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah they're planned tags for more control over remote caching: #7932 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, those are only planned, but I decided to add them to the whitelist from the beginning. I'll think about it, perhaps to avoid confusions it would be better to not include them here and just add a comment for the future task #7932, where they will be introduced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Map<String, String> map = new HashMap<>();
for (String tag :
NonconfigurableAttributeMapper.of(rule).get(CONSTRAINTS_ATTR, Type.STRING_LIST)) {
// We don't want to pollute the execution info with random things, and we also need to reserve
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

serious question: what's the downside of "polluting" execution info?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed this comment block at all - should have been more careful with copy-pasting :)

map.put(tag, "");
}
}
return ImmutableMap.copyOf(map);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of a copy use an ImmutableMap in the first place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

for (String tag :
NonconfigurableAttributeMapper.of(rule).get(CONSTRAINTS_ATTR, Type.STRING_LIST)) {
// We don't want to pollute the execution info with random things, and we also need to reserve
// some internal tags that we don't allow to be set on targets. We also don't want to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't we prevent that by prefixing internal execution requirements with "internal-" or so? also which are those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed this comment block at all :)

* Only supported tags are included into the execution info,
* see {@link #LEGAL_EXEC_INFO_KEYS} and {@link #TAGS_PROPAGATED_TO_EXEC_INFO}.
*/
public static Map<String, String> getFilteredExecutionInfo(Object executionRequirementsUnchecked,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a javadoc describing the expected type of executionRequirementsUnchecked?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

*/
public static Map<String, String> getFilteredExecutionInfo(Object executionRequirementsUnchecked,
Rule rule) throws EvalException {
Map<String, String> executionInfo = Maps.newLinkedHashMap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the LinkedHashMap as opposed to an immutablemap?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed.

@ishikhman
Copy link
Contributor Author

I am not a fan of white listing tags. It's error prone. Couldn't we apply the same filter that we apply to execution_info also to tags? What's the downside?

This is what we've agreed on in the design doc - propagate only a fixed set of tags for now. If we apply the same filter as for execution_info, more tags will be propagated.

Copy link
Member

@hlopko hlopko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! Only minor things.

"execution_requirements")));
}

Map<String, String> executionInfo =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: ImmutableMap

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

Map<String, String> executionInfo =
TargetUtils.getFilteredExecutionInfo(executionRequirementsUnchecked, ruleContext.getRule());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to do the validation here and pass "checked" execution requirements to the TargetUtils method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on this, but IMO it's better to keep it all in one place. And it's easier to test is this way ;)

// with execution requirements declared on the rules.
// See https://github.com/bazelbuild/bazel/issues/7766 for details.
private static final Predicate<String> TAGS_PROPAGATED_TO_EXEC_INFO =
tag -> tag.equals("no-remote")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As somebody completely ignorant of these things my first reaction is hmm where can I find out what these do. Wdyt about making these constants, and adding javadoc and document (or reference existing documentation) from these constants?

Map<String, String> executionInfo = Maps.newLinkedHashMap();
executionInfo.putAll(getExecutionInfoFromTags(rule));

if (executionRequirementsUnchecked != Runtime.NONE) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignoring my comment above to validate the input before entering this method, do we need to check for None at all when we call castSkylarkDictOrNoneToDict anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, we don't need it at all

*/
private static Map<String, String> getExecutionInfoFromTags(Rule rule) {
// tags may contain duplicate values.
Map<String, String> map = new HashMap<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: ImmutableMap.builder()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


# Test a basic skylark ctx.actions.run rule which has tags, that should be propagated,
# when the rule also has execution_info
function test_tags_propagated_to_run_with_exec_info_шт_кгду() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ups 🙃

@buchgr
Copy link
Contributor

buchgr commented Jun 13, 2019

This is what we've agreed on in the design doc - propagate only a fixed set of tags for now. If we apply the same filter as for execution_info, more tags will be propagated.

Two things:

  1. Independent of what's in the design doc if it makes no sense we shouldn't do it :).
  2. Also, in the actual implementation we can always be more relaxed and still fulfill the contract of the spec.

The question to answer before we move forward with this implementation is why does it need to be a whitelist and can't just be the existing filters?

@ishikhman
Copy link
Contributor Author

This is what we've agreed on in the design doc - propagate only a fixed set of tags for now. If we apply the same filter as for execution_info, more tags will be propagated.

Two things:

  1. Independent of what's in the design doc if it makes no sense we shouldn't do it :).
  2. Also, in the actual implementation we can always be more relaxed and still fulfill the contract of the spec.

Of course, and I'm not arguing with this :)
I feel a bit confused though because we have already discussed exactly the same issue during the design doc review and I had an impression that we reached an agreement. Apparently that is not the case :) Anyways, I'm open to further discussion, maybe we will come up with a better solution.

The question to answer before we move forward with this implementation is why does it need to be a whitelist and can't just be the existing filters?

The initial idea was to propagate all* tags, but it's not clear what to do in case of conflicts. Currently there is only one example of a conflict requires-network and block-network, but if we allow 'requires-' and 'block-' we will get more conflicts in the future, it's just a matter of time.
If we do not take care of these conflicts and just propagate all the tags via the existing filters, we are potentially introducing unpredictable behavior.

Theoretically, we could just use the current filter and in case of conflicts: 1) do nothing and introduce unpredictable behavior, as mentioned above; 2) allow tags to override exec_requirements declared on rules; 3) rise an exception. And it is not obvious which one to choose or whether we should choose it now. Therefore I got an idea to propagate only the tags we are sure about.

Why do you think that whiltelist is so much worse that the existing filters?

*all - something similar to the current execution info filter

@buchgr
Copy link
Contributor

buchgr commented Jun 14, 2019

I feel a bit confused though because we have already discussed exactly the same issue during the design doc review and I had an impression that we reached an agreement. Apparently that is not the case :)

Probably but I don't recall and it doesn't really matter - I reserve the right to change my opinion :). I am happy for us to go with a whitelist in the face of good arguments for it but I haven't seen any so far.

If we do not take care of these conflicts and just propagate all the tags via the existing filters, we are potentially introducing unpredictable behavior.

  1. The example of "block-network" and "requires-network" is a purely hypothetical example. Have we seen this causing trouble in the past?
  2. "block-network" and "requires-network" is pretty much only useful for tests and tests have been passing these tags to execution requirements since the very beginning and I am not aware of this being an issue for anyone.
  3. It should be up to the execution strategy to decide what it does with conflicting requirements not the propagation logic.

Why do you think that whiltelist is so much worse that the existing filters?

  1. It's error prone. If an execution strategy adds a new execution requirement it's easy to miss that this list also has to be updated.
  2. If there was a whitelist this would be the wrong place to add it. An execution strategy should define what it accepts.

Currently there is only one example of a conflict requires-network and block-network, but if we allow 'requires-' and 'block-' we will get more conflicts in the future, it's just a matter of time.

Looking at the change history of ExecutionRequirements.java does not back this claim up.

@ishikhman
Copy link
Contributor Author

Probably but I don't recall and it doesn't really matter - I reserve the right to change my opinion :). I am happy for us to go with a whitelist in the face of good arguments for it but I haven't seen any so far.

okay, then why do we need the design doc discussion-approval process at all? :)

  1. The example of "block-network" and "requires-network" is a purely hypothetical example. Have we seen this causing trouble in the past?
  2. "block-network" and "requires-network" is pretty much only useful for tests and tests have been passing these tags to execution requirements since the very beginning and I am not aware of this being an issue for anyone.
  3. It should be up to the execution strategy to decide what it does with conflicting requirements not the propagation logic.

and

It's error prone. If an execution strategy adds a new execution requirement it's easy to miss that this list also has to be updated.

If an execution strategy adds a new execution requirement it's easy to miss that there is a potential conflict.

Looking at the change history of ExecutionRequirements.java does not back this claim up.

Do you mean that it hasn't changed recently/often? This doesn't mean that it won't :)

@buchgr
Copy link
Contributor

buchgr commented Jun 17, 2019

I think it would be more productive to put forward the argument in favor of having a white list.

@ishikhman
Copy link
Contributor Author

Ok, back to the discussion.

Pros(+) and cons(-) for both options:

Whitelisting:
(-) requires white-list updated for every new exec requirement, that should be propagated

FIX: I can just a simple tests that would fail for every new execution requirement => the person who added it would need to decide - to propagate it or not.

(+) easy to implement
(+) no need in conflicts resolution

More generic filtering:
(-) potential conflicts
If an execution strategy adds a new execution requirement it's easy to miss that there is a potential conflict.

** Potential FIX**: add a test that would check new exec requirements for potential conflicts => the person who added it would need to decide what to do with it. + would be nice to enforce a proper documentation on this. For example, as for block/requires-network, at the moment block-network always takes precedence over the second one.

(+) not that difficult to implement either
(+) no need to update the list for every new exec requirement

Suggestion
I am not a big fan of white-listing as well, but if feels much easier and safer cause we have a better control over what is propagated and we do not introduce any (even potential) conflicts.

Do you have any other arguments that I've missed?

@buchgr
Copy link
Contributor

buchgr commented Jun 17, 2019

As I wrote above the current behavior is to forward both tags as execution requirements:

sh_test
  name = "foo",
  tags = ["requires-network", "blocks-network"],
)

We have code to deal with this https://source.bazel.build/bazel/+/master:src/main/java/com/google/devtools/build/lib/actions/Spawns.java;l=44?q=Spawns.java. "blocks-network" takes precedence.

We'll not introduce different behavior for test rules and build rules.

@ishikhman
Copy link
Contributor Author

As I wrote above the current behavior is to forward both tags as execution requirements:

sh_test
  name = "foo",
  tags = ["requires-network", "blocks-network"],
)

We have code to deal with this https://source.bazel.build/bazel/+/master:src/main/java/com/google/devtools/build/lib/actions/Spawns.java;l=44?q=Spawns.java. "blocks-network" takes precedence.

We'll not introduce different behavior for test rules and build rules.

Yes, I have seen this and mentioned in my previous comment that we deal with this couple of exec requirements already.

What I am trying to say is that we will not add any mechanism to prevent potential conflicts in the future, while will open the door to it.

Okay, I am still not convinced, but this discussion takes too long already. I will switch to the current filter and will try to think of a way to prevent potential conflicts. If it will take more that 1-2 hours, I will just add a comment to the ExecutionRequirements class, so that future developers are aware of the potential problems.

@ishikhman ishikhman requested a review from buchgr June 17, 2019 12:21
@ishikhman ishikhman requested a review from hlopko June 17, 2019 12:21
Copy link
Contributor

@buchgr buchgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

"tests/BUILD",
"sh_binary(name = 'with-prefix-block', srcs=['sh.sh'], tags=['block-some-feature', 'block-network', 'wrong-tag'])",
"sh_binary(name = 'with-prefix-cpu', srcs=['sh.sh'], tags=['cpu:123', 'wrong-tag'])",
"sh_binary(name = 'with-local-tag', srcs=['sh.sh'], tags=['local', 'some-tag'])"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the above two lines. they are not used in the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks :)

@ishikhman
Copy link
Contributor Author

ishikhman commented Jun 18, 2019

@laurentlb @hlopko please let me know your thought. This is a short summary of what happened:

  • I added tags propagation not exactly how we agreed in the design doc, instead of while list I re-used already existing filter that has already been used to filter execution_requirements coming from rules (and tags, but only in some cases)
  • the case that I initially thought to be a conflict (--block_network vs --requires_network) appeared not to be a problem, as this situation is taken care of at the place of usage
  • I have added both unit tests and integration tests

Therefore it is safe to assume that we can re-use existing filter, as it was introduced with the same purpose - filter tag that should be propagated to the actions.

Please let me know your thoughts, as I'd like to merge this change before the Summit next week :)

@irengrig irengrig added WIP and removed WIP labels Jun 19, 2019
@ishikhman
Copy link
Contributor Author

@laurentlb @hlopko friendly ping :) Please let me know whether you are interested in looking into this change. If not - I'll just merge it.

@laurentlb
Copy link
Contributor

Based on your last summary, LGTM

(sorry for the delay!)

@bazel-io bazel-io closed this in 5f53ab6 Jul 3, 2019
siberex pushed a commit to siberex/bazel that referenced this pull request Jul 4, 2019
Tags declared on targets are not propagated to actions and therefore are not taken into consideration by bazel. This causes some issues, for instance, target marked with a tag 'no-remote' will still be executed remotely.
As it was agreed in the design doc (see [doc](https://docs.google.com/document/d/1X2GtuuNT6UqYYOK5lJWQEdPjAgsbdB3nFjjmjso-XHo/edit#heading=h.5mcn15i0e1ch) and bazelbuild#7766 for details), set of tags to be propagated to actions as a first iteration.
This change is responsible for that first step for the Starlark Rules.

RELNOTES: tags 'no-remote', 'no-cache', 'no-remote-cache', 'no-remote-exec', 'no-sandbox' are propagated now to the actions from targets.

Closes bazelbuild#7766

Closes bazelbuild#8612.

PiperOrigin-RevId: 256369636
irengrig pushed a commit to irengrig/bazel that referenced this pull request Jul 15, 2019
Tags declared on targets are not propagated to actions and therefore are not taken into consideration by bazel. This causes some issues, for instance, target marked with a tag 'no-remote' will still be executed remotely.
As it was agreed in the design doc (see [doc](https://docs.google.com/document/d/1X2GtuuNT6UqYYOK5lJWQEdPjAgsbdB3nFjjmjso-XHo/edit#heading=h.5mcn15i0e1ch) and bazelbuild#7766 for details), set of tags to be propagated to actions as a first iteration.
This change is responsible for that first step for the Starlark Rules.

RELNOTES: tags 'no-remote', 'no-cache', 'no-remote-cache', 'no-remote-exec', 'no-sandbox' are propagated now to the actions from targets.

Closes bazelbuild#7766

Closes bazelbuild#8612.

PiperOrigin-RevId: 256369636
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tags to execution requirements propagation in Starlark
7 participants