Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bazel 0.26.0 fixed the autoconfigured C++ toolchains and now break cross-compilation of Go #2089

Closed
hlopko opened this issue Jun 4, 2019 · 52 comments

Comments

@hlopko
Copy link
Contributor

hlopko commented Jun 4, 2019

Hello :)

With the fix for bazelbuild/bazel#8330 we discovered one flaw in our design of Go cross compilation. Currently rules_go depend on C++ toolchain in //go/private:context.bzl (toolchains = ["@bazel_tools//tools/cpp:toolchain_type"]). This dependency exists no matter if we actually have C++ dependencies or not.

The problem is that with bazelbuild/bazel#8330 C++ autoconfigured toolchains properly report their constraints, and now Bazel detects that we don't actually have a C++ cross-compilation toolchain available. As a result the cross-compilation build of go fails:

ERROR: While resolving toolchains for target @io_bazel_rules_go//:go_context_data: no matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type
ERROR: Analysis of target '//:main' failed; build aborted: no matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type

To reproduce, create a hello world go_binary, and build it with bazel build //:hello_world --platforms=@io_bazel_rules_go//go/toolchain:darwin_amd64 on linux, (or if you're on mac, use a different platform).

@hlopko
Copy link
Contributor Author

hlopko commented Jun 4, 2019

CC @katre, @philwo

@jayconrod
Copy link
Contributor

@hlopko @katre Do you have a solution for this in mind already? Is there any way to make a toolchain dependency optional?

I think we could define a rule that requires the C++ toolchain. Go rules could depend on that target or not via a select. WDYT?

@hlopko
Copy link
Contributor Author

hlopko commented Jun 4, 2019

I didn't have time to think about it too much.

I guess the simple workaround for the meantime is to revert dd527c7 (I didn't try it yet but it should help).

The separate rule approach is worth thinking about. What will you select on? Ideally you want to take C++ toolchain if its present...

@lberki had a nice evil idea - have an aspect traverse go dependencies, and if it finds C++ target, it steals its C++ toolchain from it. There's a lot of handwaving there :))

@hlopko
Copy link
Contributor Author

hlopko commented Jun 4, 2019

Another not great alternative is to register Null-object-like C++ toolchains that can only fail the build when actually used.

@jayconrod
Copy link
Contributor

Would reverting dd527c7 break cgo compilation in general? Is that required in Bazel 0.26.0, or still optional?

About selecting on a separate rule, I think this will be easier once SBC is fully enabled. We'll probably have a control whether cgo is enabled, and we could select on that.

Another idea: we could define a separate set of go_toolchains that depend on the C++ toolchain, then add constraint_setting / constraint_values and distinct cgo platforms. Rules would only require the Go toolchain directly.

@lberki
Copy link
Contributor

lberki commented Jun 5, 2019

The "null toolchain" would only help if you could make it so that it's selected iff nothing else is. I thought that was not possible.

@hlopko
Copy link
Contributor Author

hlopko commented Jun 5, 2019

Reverting dd527c7 will make go cross compilation work if there are no C++ dependencies. It will not make cgo with C++ dependencies work, cgo would only cross compile Go code, not the C++ code.

Having a cgo option is a solution. This way users have to specify that they intend to use cgo and go rules will only then depend on C++ toolchain (and complain when cross compiling that the matching C++ toolchain is not present). I think SBC is non-experimental in 0.27, but we should double check.

The last approach you propose is to have distinct sets of platforms for go and cgo, right? That will work as well, it's similar to the previous solution, except that we select on constraint_value, not on configuration option.

Is that reasonable thing to ask from Go users to be explicit about cgo usage? Or they expect this to work automatically?

@katre
Copy link
Contributor

katre commented Jun 5, 2019 via email

@lberki
Copy link
Contributor

lberki commented Jun 5, 2019

Shouting from sidelines: I could imagine conditional toolchains (i.e. "I need this toolchain in some cases, but not in others") but optional (as in "it's useful if I have it, but I can deal with it it's not available") is weird.

Let's try to keep the user-facing API simple because the API is something everyone who uses toolchains needs to understand and they don't benefit from something added only for Go. If Go is the only place where a new API feature would be useful, I'd prefer the hackish "fish the C++ toolchain out from transitive dependencies" approach than extending the API. That way, it's only Go who pays the price.

@katre
Copy link
Contributor

katre commented Jun 5, 2019 via email

@jayconrod
Copy link
Contributor

Is that reasonable thing to ask from Go users to be explicit about cgo usage? Or they expect this to work automatically?

I think it's reasonable. With the Go command, cgo is enabled by default if the target platform is the same as the host platform and disabled otherwise. Users can explicitly turn it on by setting CGO_ENABLED=1 in their environment.

So here's what I'm thinking now:

  • rules_go would define a cgo constraint_setting with two values: cgo_on and cgo_off. The default is cgo_on.
  • Platforms defined in rules_go will have two variants, one for each cgo mode. So you could build with --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64_cgo to get cross-compilation with cgo, and --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64 without cgo.
    • This shouldn't break anyone. Currently, all Go toolchains require cgo, but we don't enable it internally by default when cross-compiling since people almost never have a working cross-compiling C/C++ toolchain set up.
  • We'll define a toolchain for each platform. Each cgo_on variant will depend on a target that requests a @bazel_tools//tools/cpp:toolchain_type toolchain. Each cgo_off variant will not depend on this target.

@hlopko
Copy link
Contributor Author

hlopko commented Jun 7, 2019

If CGO_ENABLED is how people configure cgo outside of Bazel, then let's do this!

@jayconrod
Copy link
Contributor

This should be fixed in #2090.

  • On the default platform, cgo is enabled.
  • With --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64, cgo is disabled, and no C/C++ toolchain is required. Previously, cgo was disabled, but we always required a C/C++ toolchain, even though we didn't use it.
  • With --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64_cgo, cgo is enabled, and a compatible C/C++ toolchain is required. These platforms are new.

cc @steeve

@hlopko Any update on the mingw constraint being required by the auto-configured C/C++ toolchains on Windows?

Unfortunately, I can't think of a way to get cgo to work on Windows by default without requiring users to configure their own toolchains or platform since MSVC is always selected. That was already a problem though (they had to say --cpu=x64_windows --compiler=mingw-gcc).

@asv
Copy link

asv commented Jun 18, 2019

I have problems with cross-compiling (windows_amd64 under linux) even in commit f2373c9...

bazel: 0.26.1
rules_go: f2373c9

$ bazel build --platforms=@io_bazel_rules_go//go/toolchain:windows_amd64 //cmd/...
ERROR: /home/asmirnov/.cache/bazel/_bazel_asmirnov/716d67f5e74ab6acaf0df7cd1e3df431/external/io_bazel_rules_go/proto/wkt/BUILD.bazel:3:1: in go_proto_library rule @io_bazel_rules_go//proto/wkt:wrappers_go_proto: 
Traceback (most recent call last):
        File "/home/asmirnov/.cache/bazel/_bazel_asmirnov/716d67f5e74ab6acaf0df7cd1e3df431/external/io_bazel_rules_go/proto/wkt/BUILD.bazel", line 3
                go_proto_library(name = 'wrappers_go_proto')
        File "/home/asmirnov/.cache/bazel/_bazel_asmirnov/716d67f5e74ab6acaf0df7cd1e3df431/external/io_bazel_rules_go/proto/def.bzl", line 116, in _go_proto_library_impl
                go_srcs.extend(compiler.compile(go, compiler = ...))
        File "/home/asmirnov/.cache/bazel/_bazel_asmirnov/716d67f5e74ab6acaf0df7cd1e3df431/external/io_bazel_rules_go/proto/def.bzl", line 116, in go_srcs.extend
                compiler.compile(go, compiler = compiler, protos = [g...], <2 more arguments>)
        File "/home/asmirnov/.cache/bazel/_bazel_asmirnov/716d67f5e74ab6acaf0df7cd1e3df431/external/io_bazel_rules_go/proto/compiler.bzl", line 72, in compiler.compile
                args.add("-compiler_path", go.cgo_tools.c_c...])
        File "/home/asmirnov/.cache/bazel/_bazel_asmirnov/716d67f5e74ab6acaf0df7cd1e3df431/external/io_bazel_rules_go/proto/compiler.bzl", line 72, in args.add
                go.cgo_tools.c_compiler_path
object of type 'NoneType' has no field 'c_compiler_path'

Is there a working workaround, if I do not need support cgo?

@jayconrod jayconrod reopened this Jun 18, 2019
@jayconrod
Copy link
Contributor

@asv We shouldn't need the C compiler to run protoc. I'll look into that. We're using its path to set a value for PATH, but I don't think that actually makes sense.

--platforms=@io_bazel_rules_go//go/toolchain:windows_amd64_cgo may work if you have configured a working C/C++ cross compiler.

jayconrod pushed a commit to jayconrod/rules_go that referenced this issue Jun 18, 2019
The C compiler was used to set PATH. No idea why that was useful though.

Fixes bazel-contrib#2089
@steeve
Copy link
Contributor

steeve commented Jun 19, 2019

I have the same problem as #2089 (comment) when I'm using a starlark transition to set the correct platform based on the current cpu (where go is a dependency of a ios_framework or android_binary that's multiarch).

My transition:

_ANDROID_CPUS_TO_PLATFORMS = {
    "x86": "@io_bazel_rules_go//go/toolchain:android_386_cgo",
    "x86_64": "@io_bazel_rules_go//go/toolchain:android_amd64_cgo",
    "armeabi-v7a": "@io_bazel_rules_go//go/toolchain:android_arm_cgo",
    "arm64-v8a": "@io_bazel_rules_go//go/toolchain:android_arm64_cgo",
}

_IOS_CPUS_TO_PLATFORMS = {
    "ios_armv7": "@io_bazel_rules_go//go/toolchain:ios_arm_cgo",
    "ios_arm64": "@io_bazel_rules_go//go/toolchain:ios_arm64_cgo",
    "ios_i386": "@io_bazel_rules_go//go/toolchain:ios_386_cgo",
    "ios_x86_64": "@io_bazel_rules_go//go/toolchain:ios_amd64_cgo",
}

def _go_platform_transition_impl(settings, attr):
    cpu = settings["//command_line_option:cpu"]
    platform = ""
    if settings["//command_line_option:crosstool_top"].workspace_name == "androidndk":
        platform = _ANDROID_CPUS_TO_PLATFORMS[cpu]
    elif cpu in _IOS_CPUS_TO_PLATFORMS:
        platform = _IOS_CPUS_TO_PLATFORMS[cpu]
    return {
        "//command_line_option:platforms": platform,
    }

go_platform_transition = transition(
    implementation = _go_platform_transition_impl,
    inputs = [
        "//command_line_option:cpu",
        "//command_line_option:crosstool_top",
    ],
    outputs = [
        "//command_line_option:platforms",
    ],
)

On android the error is somewhat different:

ERROR: While resolving toolchains for target //examples/helloworld:HelloLib.java.cc_library: no matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type

@steeve
Copy link
Contributor

steeve commented Jun 19, 2019

If I enable legacy objc/cgo mode:

ERROR: /Users/steeve/go/src/github.com/znly/rules_gomobile/examples/helloworld/BUILD.bazel:13:1: in _cgo_codegen rule //examples/helloworld:HelloLib.objc.binary%js_wasm%cgo_codegen:
Traceback (most recent call last):
	File "/Users/steeve/go/src/github.com/znly/rules_gomobile/examples/helloworld/BUILD.bazel", line 13
		_cgo_codegen(name = 'HelloLib.objc.binary%js_wasm%cgo_codegen')
	File "/private/var/tmp/_bazel_steeve/7a2b891e90b0ccfacc379ac7b2b2ca04/external/io_bazel_rules_go/go/private/rules/cgo.bzl", line 269, in _cgo_codegen_impl
		fail("Go toolchain does not support c...")
Go toolchain does not support cgo

@jayconrod
Copy link
Contributor

#2101 will remove the go_proto_compiler dependency on the C/C++ toolchain. I don't think we ever really needed it.

However, I'm not able to build anything proto-related on Windows without MSVC, including proto_library. The mingw toolchain doesn't get picked up, even with --host_platform set to a toolchain with the mingw constraint. Not really sure how to proceed. @hlopko any ideas?

@steeve
Copy link
Contributor

steeve commented Jul 16, 2019

@hlopko I could get away with manually registering them, though:

[
    toolchain(
        name = "android_%s" % cpu,
        target_compatible_with = [
            "@bazel_tools//platforms:android",
            "@bazel_tools//platforms:%s" % cpu,
        ],
        toolchain = "@androidndk//:%s-linux-android-clang7.0.2-libcpp" % cpu,
        toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
    )
    for cpu in ["x86", "x86_64", "arm", "aarch64"]
]
[
    register_toolchains("//:android_%s" % cpu)
    for cpu in ["x86_64", "arm", "aarch64"]
]

Is there anything more needed? I can compile cc_* rules just fine with --incompatible_enable_cc_toolchain_resolution.

@hlopko
Copy link
Contributor Author

hlopko commented Jul 16, 2019

Registering ndk toolchains is one of the problems. Other is having the fat apk produced correctly.

@steeve
Copy link
Contributor

steeve commented Jul 16, 2019

@hlopko the problem though is that it seems if we put a --platform, then the crosstool_top is ignored (and I guess the transition from android_crosstool_top as well)

@Toxicable
Copy link

Toxicable commented Oct 10, 2019

I believe this causes and issue for people using rules_docker

We're upgrading rules_docker to 0.12.0 which now depends on rules_go and we're getting this error in CI:

ERROR: /home/runner/.cache/bazel/_bazel_runner/f2e96da83c9a9bca36350376aeb4df02/external/io_bazel_rules_go/BUILD.bazel:80:1: Target '@io_bazel_rules_go//:cgo_context_data' depends on toolchain '@local_config_cc//:cc-compiler-k8', which cannot be found: error loading package '@local_config_cc//': Unable to load file '@local_config_cc//:armeabi_cc_toolchain_config.bzl': file doesn't exist'
ERROR: Analysis of target '//apps/APPNAME/be:image' failed; build aborted: error loading package '@local_config_cc//': Unable to load file '@local_config_cc//:armeabi_cc_toolchain_config.bzl': file doesn't exist

We don't need cgo or cross compiling of any C/C++
Is there a way to disable this?
I don't know much about how this is configured, but from reading through here I set: build --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64 however that still requires that local_config_cc be configured as the error above shows.

@jayconrod
Copy link
Contributor

Just to recap where we are with this issue:

  • In some configurations, rules_go needs a C/C++ toolchain in order to build cgo code. Even if there is no cgo code, a C linker is needed in some configurations. The Go standard library also has some optional C dependencies.
  • So what rules_go really needs is an optional C/C++ toolchain dependency that's only provided in some configurations. I've attempted to do that by requiring a C/C++ toolchain from @io_bazel_rules_go//:cgo_context_data, which is a dependency of @io_bazel_rules_go//:go_context_data, hidden behind a select expression which only picks it on platforms where cgo is enabled. go_context_data is a common dependency of all Go rules.
  • Bazel seems to analyze cgo_context_data even though it isn't required. bazel cquery --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64 'somepath(@io_bazel_rules_go//tests/core/go_binary:hello, @io_bazel_rules_go//:cgo_context_data)' shows that there's no path through the configured target graph from an example binary rule to cgo_context_data. It's possible there's some dependency sneaking through the toolchains, or there might be a bug in Bazel.

I've spent several weeks analyzing this issue, and I've basically timeboxed it at this point. It should be fixed eventually but probably won't be any time soon. I'd recommend folks work around this by installing a C/C++ toolchain or configuring a dummy cc_toolchain.

@Toxicable
Copy link

@jayconrod We have bazel installation doc: https://docs.bazel.build/versions/master/install-ubuntu.html#step-1-install-required-packages but it fails with the same error.
I don't know how else to debug what system deps are required to work around this.
This is the bazel section of our ci container dockerfile, which has a ubuntu:18.04 base:

# Bazel
RUN apt-get update && apt-get -y install --no-install-recommends pkg-config zip g++ zlib1g-dev unzip python3 git \
&& yarn global add @bazel/bazel@1.0.0 \
&& bazel version

Something else that might be helpful is with: build --toolchain_resolution_debug we see this:

INFO: ToolchainResolution: Selected execution platform @local_config_platform//:host, type @io_bazel_rules_go//go:toolchain -> toolchain @go_sdk//:go_linux_amd64_cgo-impl

I can post the full log output, it's just that it's a couple of a hundred lines.
Is there a flag we can pass to force using the non cgo toolchain?

@jayconrod
Copy link
Contributor

@Toxicable Do you have gcc or clang available in the container? Are you able to build a minimal cc_binary?

When I built a container earlier today following those instructions, something pulled in build-essential, which is probably the package you need. --no-install-recommends may be causing problems.

@Toxicable
Copy link

Toxicable commented Oct 11, 2019

I removed --no-install-recommends
Which I now see build-essential in the list of packages installed, however I did try installing that seprately previously.

In CI

  • gcc --help works correctly
  • cc_binary works correctly (npx bazel run :hello -> prints out hello after compiling)

But still failing on this error:

ERROR: /home/runner/.cache/bazel/_bazel_runner/f2e96da83c9a9bca36350376aeb4df02/external/io_bazel_rules_go/BUILD.bazel:80:1: Target '@io_bazel_rules_go//:cgo_context_data' depends on toolchain '@local_config_cc//:cc-compiler-k8', which cannot be found: error loading package '@local_config_cc//': Unable to load file '@local_config_cc//:armeabi_cc_toolchain_config.bzl': file doesn't exist'

Note this part here:

Unable to load file '@local_config_cc//:armeabi_cc_toolchain_config.bzl': file doesn't exist'

Whys it looking for arm toolchain configs?

@jayconrod
Copy link
Contributor

@Toxicable rules_go doesn't load that file directly. Something in the Bazel toolchain auto-configuration is probably loading it. Is this in a freshly started container, or was build-essential installed after running a bazel command? If the latter, you may need to run bazel clean --expunge to get it to reconfigure the toolchain.

I don't know enough about your environment to help more than that. Feel free to open a new issue with enough information to reproduce the problem, and I can take a look.

@nlopezgi
Copy link
Contributor

thanks for the additional info @Toxicable , I think the error might be related to some recent toolchain changes we made in rules_docker.
Can you run
bazel query --output=label @local_config_cc//...
Also, can you try to use bazelbuild/rules_docker#1205 to see if that fixes your issue. Apologies for the problem. Toolchains are hard!

@Toxicable
Copy link

It's alright, thanks for taking the time to debug this.

Both locally and in CI I get:

f.wiles@fwiles:~/workspace/WORKSPACENAME$ npx bazel query --output=label @local_config_cc//...
INFO: Writing tracer profile to '/home/f.wiles/.cache/bazel/_bazel_f.wiles/4e95cbad76bfb4ed000904165db39268/command.profile.gz'
WARNING: Switched off --watchfs again... temporarily falling back to manually checking files for changes
@local_config_cc//:toolchain
@local_config_cc//:malloc
@local_config_cc//:cc_wrapper
@local_config_cc//:cc-compiler-k8
@local_config_cc//:local
@local_config_cc//:compiler_deps
@local_config_cc//:cc-compiler-armeabi-v7a
@local_config_cc//:stub_armeabi-v7a
@local_config_cc//:empty
Loading: 0 packages loaded

when trying to use your PR I get this error locally:

ERROR: An error occurred during the fetch of repository 'nodejs_base':
   Pull command failed: 2019/10/12 10:51:03 Running the Image Puller to pull images from a Docker Registry...
2019/10/12 10:51:04 Image pull was unsuccessful: reading image "index.docker.io/toxicable/debian9-curl@sha256:bddb0e543b4556a8b9583f7bad36370aaa1e81ebaa8b09ec771268ec15cc69b5": <empty transport.Error response>
 (/home/f.wiles/.cache/bazel/_bazel_f.wiles/4e95cbad76bfb4ed000904165db39268/external/go_puller_linux/file/downloaded -directory /home/f.wiles/.cache/bazel/_bazel_f.wiles/4e95cbad76bfb4ed000904165db39268/external/nodejs_base/image -os linux -os-version  -os-features  -architecture amd64 -variant  -features  -name index.docker.io/toxicable/debian9-curl@sha256:bddb0e543b4556a8b9583f7bad36370aaa1e81ebaa8b09ec771268ec15cc69b5)

Note that nodejs_base is not the default one from ruels_docker, it's one we defiend at index.docker.io/toxicable/debian9-curl

@nlopezgi
Copy link
Contributor

nlopezgi commented Oct 11, 2019

thanks @Toxicable , it looks like that PR solves the toolchain issue, let's open up a separate issue in rules_docker to talk about the issue you are having pulling that image.

@lubinszARM
Copy link
Contributor

Hi,
Any progress of this issue? I found the same problem.
@jayconrod @hlopko

@yicm
Copy link

yicm commented Aug 6, 2020

Hi, @jayconrod @hlopko The platform functions of rules_go cannot be unified with those of other languages. For example, I configured the C++ platform tool chain and customized the platform constraints, while rules_go fixed the constraints of the platform functions. For a mixed project of C++ and Go, I need to divide two compilations, is there a way to use the following commands to achieve one-time compilation?

bazel build --platforms=//platforms:custom_platform  //...

@asv
Copy link

asv commented Aug 6, 2020

We also face a similar problem (like @yicm) in a mixed Java project:

$  bazelisk build //... --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64_cgo 
2020/08/05 17:43:05 Downloading https://releases.bazel.build/3.4.1/release/bazel-3.4.1-darwin-x86_64...
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
ERROR: While resolving toolchains for target //java/svc2/src/main/java/com/example/svc2:svc2: No matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type. Maybe --incompatible_use_cc_configure_from_rules_cc has been flipped and there is no default C++ toolchain added in the WORKSPACE file? See https://github.com/bazelbuild/bazel/issues/10134 for details and migration instructions.
ERROR: Analysis of target '//java/svc2/src/main/java/com/example/svc2:svc2' failed; build aborted: No matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type. Maybe --incompatible_use_cc_configure_from_rules_cc has been flipped and there is no default C++ toolchain added in the WORKSPACE file? See https://github.com/bazelbuild/bazel/issues/10134 for details and migration instructions.
INFO: Elapsed time: 52.756s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (14 packages loaded, 110 targets configure

I can create a separate issue along with the repro repository if needed.

Edit: problem occurs only under Mac OS. On linux we have no problem with cross compilation.

@yicm
Copy link

yicm commented Aug 6, 2020

Hi, @asv My error message is the same as yours. I think there is a need to create a new issue. I have not succeeded under the Linux platform. Maybe we can show this problem better through a bazel demo project.

@jayconrod
Copy link
Contributor

Closing this issue because it's been fixed for a long time. At the time, rules_go had an unconditional dependency on the C/C++ toolchain in configurations where it wasn't needed and wasn't available.

@asv @yicm Could you open new issues and fill out the issue template? I don't think the issues you're commenting on are related to the problems you're seeing, but I don't have enough information to help more than that.

alexeagle pushed a commit to alexeagle/rules_docker that referenced this issue Mar 12, 2021
It was fixed over a year ago according to bazel-contrib/rules_go#2089

I've had some users confused by this message since we tell them to use this rule on Mac
alexeagle added a commit to bazelbuild/rules_docker that referenced this issue Mar 12, 2021
It was fixed over a year ago according to bazel-contrib/rules_go#2089

I've had some users confused by this message since we tell them to use this rule on Mac
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests