Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transition container image target platform #1963

Merged
merged 1 commit into from
Dec 17, 2021

Conversation

uhthomas
Copy link
Collaborator

Transition to the target platform associated with the architecture and
operating_system attributes on container image rules.

This change allows for container image rules to build the correct binary for
the target platform, regardless of the host platform. Container image rules
would require the use of the target_compatible_with attribute to prevent
mismatching host and target platforms building dependencies incorrectly.
Additionally, hosts which did not match the target platform had to explicitly
specify the target platform with the --platforms command-line option.

This change fixes the aforementioned issues and
#690.

Massive thank you to @joneshf for the initial source. It has been adapted to
automatically select the target platform associated with the container image,
as opposed to always using @io_bazel_rules_go//go/toolchain:linux_amd64.

PR Checklist

Please check if your PR fulfills the following requirements:

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Documentation content changes
  • Other... Please describe:

What is the current behavior?

Issue Number: #690

container_image rules do not transition the target platform.

What is the new behavior?

container_image rules transition the target platform associated with the architecture and operating_system attributes.

Does this PR introduce a breaking change?

  • Yes
  • No

Other information

@google-cla google-cla bot added the cla: yes label Nov 21, 2021
@uhthomas uhthomas force-pushed the master branch 2 times, most recently from 7cc6c1d to d889a1e Compare November 22, 2021 00:53
@uhthomas
Copy link
Collaborator Author

uhthomas commented Nov 22, 2021

I noticed that unlike jar_dep_layer; jar_app_layer, _war_app_layer and _war_dep_layer use _container.image instead of lang_image. Is this on purpose? This difference means that at current, java_image rule will not inherit this change.

@alexeagle
Copy link
Collaborator

@smukherj1 do you have any answer to that question about why Java is different in this regard?
@uhthomas are there scenarios where it matters? .jar files are supposedly "portable" - I guess JNI that depends on a C++ library (SWIG?) would require the transition?

@uhthomas
Copy link
Collaborator Author

@alexeagle Thanks for taking a look! I'm not too familiar with the java image rules, java rules or cross-compilation of java. I'm not sure what effect changing the target platform would have, if any. A cursory Google shows that Java may sometimes depend on C as you suggested. As such, even if the JARs are portable, the C would still need to be built for the correct target platform.

Copy link
Collaborator

@alexeagle alexeagle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fantastic. Does it mean users no longer need to specify a --platform on the command line when building an image? I think some documentation update may be required too.

Maintainers here don't know much about transitions, we'll try to get someone from the bazel team to help verify.

lang/image.bzl Outdated Show resolved Hide resolved
lang/image.bzl Outdated Show resolved Hide resolved
@uhthomas uhthomas force-pushed the master branch 8 times, most recently from ecf7f7e to a76c92d Compare November 23, 2021 01:50
@uhthomas
Copy link
Collaborator Author

Thanks @alexeagle! You're absolutely correct, explicitly targeting platforms with the --platforms flag is no longer required.

I think you're right that some documentation updates may be in order. I'll take a look tomorrow. 😄

@uhthomas uhthomas force-pushed the master branch 2 times, most recently from fd04391 to 73964b2 Compare November 23, 2021 02:12
@ahmedkirasystems
Copy link

ahmedkirasystems commented Nov 25, 2021

Hey @uhthomas I tried consuming rules from your repo.


http_archive(
    name = "io_bazel_rules_docker",
    sha256 = "62b79a11bae076a8acd97749a06e5b8448c793ec5a730aea925e68c6f46dcaff",
    strip_prefix = "rules_docker-73964b225c38052bbfb1ea556bfb49d2bd2edc77",
    urls = ["https://github.com/uhthomas/rules_docker/archive/73964b225c38052bbfb1ea556bfb49d2bd2edc77.tar.gz"],
)

I'm using the same command which runs when using 0.21.0 version.

bazel run //:app

I get error related to bazelbuild/bazel#10134

INFO: Invocation ID: db25a6c0-c7e1-4a87-8071-d0548bc83ac1
ERROR: While resolving toolchains for target @io_bazel_rules_go//:cgo_context_data: No matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type. Maybe --incompatible_use_cc_configure_from_rules_cc has been flipped and there is no default C++ toolchain added in the WORKSPACE file? See https://github.com/bazelbuild/bazel/issues/10134 for details and migration instructions.
ERROR: Analysis of target '//:app' failed; build aborted: No matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type. Maybe --incompatible_use_cc_configure_from_rules_cc has been flipped and there is no default C++ toolchain added in the WORKSPACE file? See https://github.com/bazelbuild/bazel/issues/10134 for details and migration instructions.
INFO: Elapsed time: 1.225s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (72 packages loaded, 549 targets configured)
FAILED: Build did NOT complete successfully (72 packages loaded, 549 targets configured)
    Fetching @bazel_gazelle_go_repository_config; Restarting.

I'm using container_image, which from the conversations above, I get the feeling that it should work.

@uhthomas
Copy link
Collaborator Author

Thank you for trying out these changes @ahmedkirasystems. You're completely right and I feel silly for not testing this on a more minimal repo which doesn't have additional CC toolchains.

I cannot reproduce this with a minimal container_image:

container_image(name = "some_image")

but I can reproduce this with go_image:

go_image(
    name = "some_image",
    embed = [":some_lib"],
)

go_image targets typically have their goarch and goos attributes configured which implicitly disable cgo. This does not happen when using the image rule transition.

This behaviour can be observed with rules_docker v0.21.0 and the following go_image targets. On a MacOS host, the pure = "off" build fails with the same error you observed:

# fail
go_image(
    name = "some_image",
    embed = [":some_lib"],
    goarch = "amd64",
    goos = "linux",
    pure = "off",
)

# ok
go_image(
    name = "some_image",
    embed = [":some_lib"],
    goarch = "amd64",
    goos = "linux",
    pure = "on",
)

# ok 
go_image(
    name = "some_image",
    embed = [":some_lib"],
    goarch = "amd64",
    goos = "linux",
    pure = "auto",
)

# ok (implicit `pure = "auto"`)
go_image(
    name = "some_image",
    embed = [":some_lib"],
    goarch = "amd64",
    goos = "linux",
)

This seems like some weird behaviour on rules_go's part honestly, but does mostly fall inline with the behaviour of the native Go toolchain. I think the only exception is that the native Go toolchain does not implicitly disable cgo when goarch and goos are the same as the host.

Side note: I spent a little bit trying to understand why both pure = "auto" and pure = "on" built okay, but the latter would fail to execute with standard_init_linux.go:228: exec user process caused: no such file or directory. Turns out I had --@io_bazel_rules_go//go/config:linkmode=pie in my .bazelrc. Not sure why this broke things, but thought it was worth noting.

I was going to suggest that this could be fixed by setting pure = "on" but rules_go still seems to want to use C? Explicitly configuring goarch and goos again properly disables C. I imagine this is because the pure attribute is a little bit misleading, such cgo is actually controlled elsewhere (and internal_cgo_off) via platforms.

So I think the correct fix is either transition @io_bazel_rules_go//go/toolchain:cgo_constraint directly, or use the native Go platform if the downstream rule is go_image.


Sorry, I understand this comment is quite long.

TL;DR: The go_image rule needs to replicate the original behaviour of force disabling cgo by default. I'll think about the best way to do this, any input would be really appreciated.

@uhthomas
Copy link
Collaborator Author

I wonder if there's some sort of subtle bug somewhere in rules_go, or if the documentation is wrong. Per https://github.com/bazelbuild/rules_go/blob/master/go/core.rst#go_binary:

If auto, pure mode is enabled when no C/C++ toolchain is configured or when cross-compiling.

So, I'm not sure why pure mode is not enabled when there is no C/C++ toolchain.

@uhthomas uhthomas force-pushed the master branch 3 times, most recently from 47c404c to 719cfdf Compare November 26, 2021 23:32
@alexeagle
Copy link
Collaborator

rules_docker is hard to maintain due to all the languages being supported in this repo! I don't know the answer for this Go transition question either. Does it make sense to cut scope of this PR to cover everything except Go? Or else maybe we need another maintainer like @pcj (or a rules_go maintainer like @achew22 or @steeve ) to advise why the auto-selection of pure mode isn't working under transitions.

@uhthomas
Copy link
Collaborator Author

uhthomas commented Nov 27, 2021

I was trialing some changes which would configure cgo on behalf of rules_go, but I think that's wrong and so I'm going to revert it. I think there's a bug in rules_go somewhere as pure has absolutely no effect at all when using platforms instead of goarch and goos. This behaviour can be observed by targeting a custom linux/amd64 platform from MacOS:

bazel build --platforms=//:linux_x86_64 //:some_bin
platform(
    name = "linux_x86_64",
    constraint_values = [
        "@platforms//cpu:x86_64",
        "@platforms//os:linux",
    ],
)

# fail
go_binary(
    name = "some_bin",
    embed = [":some_lib"],
    visibility = ["//visibility:public"],
)

# fail
go_binary(
    name = "some_bin",
    embed = [":some_lib"],
    pure = "off",
    visibility = ["//visibility:public"],
)

# fail
go_binary(
    name = "some_bin",
    embed = [":some_lib"],
    pure = "on",
    visibility = ["//visibility:public"],
)

# ok
go_binary(
    name = "some_bin",
    embed = [":some_lib"],
    goarch = "amd64",
    goos = "linux",
    visibility = ["//visibility:public"],
)

# ok
go_binary(
    name = "some_bin",
    embed = [":some_lib"],
    goarch = "amd64",
    goos = "linux",
    pure = "on",
    visibility = ["//visibility:public"],
)

@lazcamus
Copy link

Is it possible to update the release notes for 0.23 to include "this release is not expected to work on Mac OSX" under the "Breaking Changes" section @gravypod ?

@alexeagle
Copy link
Collaborator

@uhthomas what about the config option to disable transitions? #1963 (comment)
--@io_bazel_rules_docker//transitions:enable=false

@fmeum
Copy link

fmeum commented Apr 8, 2022

@uhthomas In the (probably very common) case of a Linux x86_64 image built as part of a build with target platform Linux x86_64, the transition to a synthetic platform will change the configuration and thus cause all dependencies of the image to be rebuilt. Do you think that it would be feasible to move the configuration change from an incoming to an outgoing dependency edge, perhaps by wrapping the image rules in a macro that applies a forwarding rule? This would make it possible to make the transition a no-op if CPU and OS are already at the expected values.

@uhthomas
Copy link
Collaborator Author

uhthomas commented Apr 8, 2022

@uhthomas what about the config option to disable transitions? #1963 (comment) --@io_bazel_rules_docker//transitions:enable=false

Sounds like a reasonable suggestion. I can cut a PR for that soon.

@uhthomas
Copy link
Collaborator Author

uhthomas commented Apr 8, 2022

@uhthomas In the (probably very common) case of a Linux x86_64 image built as part of a build with target platform Linux x86_64, the transition to a synthetic platform will change the configuration and thus cause all dependencies of the image to be rebuilt. Do you think that it would be feasible to move the configuration change from an incoming to an outgoing dependency edge, perhaps by wrapping the image rules in a macro that applies a forwarding rule? This would make it possible to make the transition a no-op if CPU and OS are already at the expected values.

That's an interesting idea, I like it. Would you be able to help me understand how the transition would become no-op by switching to an outgoing dependency edge?

@fmeum
Copy link

fmeum commented Apr 8, 2022

@uhthomas I've thought more about it and now see two ways to prevent rebuilds:

  1. Let users specify a string_dict mapping of CPU/OS pairs to the label of the platform the user would like to use for that pair. Then, instead of using the synthetic platform, set platforms to the singleton lost containing the specified platform in the transition (if one is set in the dict). This should result in the transition being a no-op when the platform doesn't change.

  2. Apply the transition on the outgoing edges of every image rule. This allows the transition to access the values of attributes specified via select expressions, which currently isn't supported for transitions on incoming edges (Transitions can inspect configurable attributes, but don't see a value for them bazel#15157).
    Then add common string attributes current_os and current_cpu to these rules and populate them in a wrapper macro using select on @platforms//os:{cpu,os} to translate the constraint_values into string. In the transition, return an empty dict if the image's os and cpu match the current ones read from the attrs.

Happy to provide more details and think about further alternatives.

laurynaslubys added a commit to laurynaslubys/rules_docker that referenced this pull request Apr 24, 2022
Transitions, introduced in bazelbuild#1963 can now be disabled by calling bazel with `--@io_bazel_rules_docker//transitions:enable=no`.
Added tests to verify the behaviour.
@laurynaslubys laurynaslubys mentioned this pull request Apr 24, 2022
12 tasks
laurynaslubys added a commit to laurynaslubys/rules_docker that referenced this pull request Apr 24, 2022
Transitions, introduced in bazelbuild#1963 can now be disabled by calling bazel with `--@io_bazel_rules_docker//transitions:enable=no`.
Added tests to verify the behaviour.
laurynaslubys added a commit to laurynaslubys/rules_docker that referenced this pull request Apr 24, 2022
Transitions, introduced in bazelbuild#1963 can now be disabled by calling bazel with `--@io_bazel_rules_docker//transitions:enable=no`.
Added tests to verify the behaviour.
@uhthomas
Copy link
Collaborator Author

uhthomas commented May 5, 2022

I think it's a good idea to merge #2068. The next LTS release of Bazel will include optional toolchains, which should resolve this whole thing altogether bazelbuild/bazel#3601 (comment).

@uri-canva
Copy link
Contributor

The next LTS release is still half a year away isn't it?

@linzhp linzhp mentioned this pull request May 6, 2022
10 tasks
uhthomas pushed a commit that referenced this pull request May 23, 2022
* Make transitions optional

Transitions, introduced in #1963 can now be disabled by calling bazel with `--@io_bazel_rules_docker//transitions:enable=no`.
Added tests to verify the behaviour.

* Update ubuntu image to try to bust the bazel cache
@thefallentree
Copy link

the suggestion --@io_bazel_rules_docker//transitions:enable=false doesn't really work , bazel complains ```ERROR: @io_bazel_rules_docker//transitions:enable :: Error loading option @io_bazel_rules_docker//transitions:enable: no such package '@io_bazel_rules_docker//transitions': BUILD file not found in directory 'transitions' of external repository @io_bazel_rules_docker. Add a BUILD file to a directory to mark it as a package.

@uhthomas
Copy link
Collaborator Author

uhthomas commented Jun 7, 2022

@thefallentree which version of rules_docker are you using? (including sha256)

@thefallentree
Copy link

thefallentree commented Jun 8, 2022

@thefallentree which version of rules_docker are you using? (including sha256)

I've been using this.

     http_archive(
         name = "io_bazel_rules_docker",
         sha256 = "85ffff62a4c22a74dbd98d05da6cf40f497344b3dbf1e1ab0a37ab2a1a6ca014",
         strip_prefix = "rules_docker-0.23.0",
         urls = ["https://github.com/bazelbuild/rules_docker/releases/download/v0.23.0/rules_docker-v0.23.0.tar.gz"],
     )

which doesn't work, so I have been stuck on 0.22.0. instead.

And just to clarify , 0.23.0 doesn't work for me because of java_image rule is complaining about unresolved c++ toolchain, so I'm not sure how that's possible, do you have any more information about it?

@thefallentree
Copy link

For anyone landed here:

Now with rules_docker-0.25.0 new release

you can now use --@io_bazel_rules_docker//transitions:enable=false

and it should work!

@cfife-btig
Copy link

I don't know what other people's use cases are - it sounds like a lot of people wanted a disable transitions flag because of an error with go images in macOS.

For us, we are having some inconveniences because as I believe @fmeum has said, the build configurations for images are different which is causing us to have to reproduce all of the build steps anytime we build an image, even if I had already built all the prior steps.

Since we don't have any special build targets, I went ahead and tried using the new disable transitions flag, but I don't think this works at least for my use case. Images still have a different build configuration and makes me run my pipeline from scratch. Maybe optional toolchaining or something like what @fmeum suggested would be better. Or maybe this is an issue with the disable transitions flag.

In the meantime, my team has not felt like the latest releases have caused more slowness then help, so we've just been staying at 0.22 rules_docker since we don't have any special build targets.

St0rmingBr4in pushed a commit to St0rmingBr4in/rules_docker that referenced this pull request Oct 17, 2022
* Make transitions optional

Transitions, introduced in bazelbuild#1963 can now be disabled by calling bazel with `--@io_bazel_rules_docker//transitions:enable=no`.
Added tests to verify the behaviour.

* Update ubuntu image to try to bust the bazel cache
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.