Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extra_targets, user configurable toolchains, and wasm support #85

Conversation

rrbutani
Copy link
Collaborator

@rrbutani rrbutani commented Aug 16, 2021

(this PR builds on #75; that PR should be reviewed/merged before this one)

This PR is not ready for merging yet but I thought I'd open the PR anyways to get some feedback and see if this is a use case the maintainers are interested in.


This PR adds support for extra_toolchains, an attribute on configure that lets users register additional toolchains for different target triples:

load("@com_grail_bazel_toolchain//toolchain:rules.bzl", "llvm_toolchain")
llvm_toolchain(
    name = "llvm_toolchain",
    llvm_version = "8.0.0",
    extra_targets = [
      "wasm32-unknown-wasi",
    ],

    # Extra targets can have their sysroots overriden too:
    sysroot = {
        "linux": "@some_example_sysroot_repo//:linux_sysroot",
        "darwin": "@some_example_sysroot_repo//:macos_sysroot",

        "linux_wasm32-unknown-wasi": "@some_example_sysroot_repo//:wasi_sysroot",
        "darwin_wasm32-unknown-wasi": "@some_example_sysroot_repo//:wasi_sysroot",
    },
)

The toolchains registered with extra_targets have the appropriate constraint values so that Bazel will use them as necessary through toolchain resolution.


In extending the internals of this repo to support extra_targets, cc_toolchain_config and other internal macros were made a fair bit more general. This PR also tweaks these macros to be usable externally, allowing users to make small modifications to the configured toolchains without needing to modify this repo:

# WORKSPACE
# (parts to set up `@llvm_toolchain` have been elided; see above)

llvm_toolchain(
    name = "llvm_toolchain",
    llvm_version = "8.0.0",

    # NOTE: This is required to set up toolchains outside of `@llvm_toolchain`, unfortunately
    absolute_paths = True,
 )

# This registers the default toolchains.
load("@llvm_toolchain//:toolchains.bzl", "llvm_register_toolchains", "register_toolchain")

llvm_register_toolchains()

# Now let's make our own:
http_archive(
    name = "thumbv7-sysroot",
    urls = ["example.com"],
)
register_toolchain("//tests:custom_toolchain_example")

# BUILD file:
# Example Custom Toolchain:
load("@llvm_toolchain//:cc_toolchain_config.bzl", "cc_toolchain_config")

# Docs for this function and `overrides` are in `cc_toolchain_config.bzl.tpl`.
cc_toolchain_config(
    name = "custom_toolchain_example_config",
    host_platform = "linux",
    custom_target_triple = "thumbv7em-unknown-none-gnueabihf",
    overrides = {
        "target_system_name": "thumbv7em-unknown-none-gnueabihf",
        "target_cpu": "thumbv7em",
        "target_libc": "unknown",
        "abi_libc_version": "unknown",

        # If you omit this, be sure to depend on
        # `@llvm_toolchain:host_sysroot_components`.
        # "sysroot_path": "external/thumbv7-sysroot/sysroot",

        "extra_compile_flags": [
            "-mthumb",
            "-mcpu=cortex-m4",
            "-mfpu=fpv4-sp-d16",
            "-mfloat-abi=hard",
        ],
        "omit_hosted_linker_flags": True,
        "omit_cxx_stdlib_flag": False,
        "use_llvm_ar_instead_of_libtool_on_macos": True,
    }
)

load("@com_grail_bazel_toolchain//toolchain:rules.bzl", "conditional_cc_toolchain")
conditional_cc_toolchain(
    name = "custom_toolchain",
    toolchain_config = ":custom_toolchain_example_config",
    host_is_darwin = False,

    sysroot_label = "@llvm_toolchain//:host_sysroot_components", # use this if not overriding
    # sysroot_label = "@thumbv7-sysroot//:sysroot", # override

    absolute_paths = True, # this is required for toolchains set up outside of `@llvm_toolchain`, unfortunately
    llvm_repo_label_prefix = "@llvm_toolchain//",
)

# Constraints come from here: https://github.com/bazelbuild/platforms
toolchain(
    name = "custom_toolchain_example",
    exec_compatible_with = [
        "@platforms//cpu:x86_64",
        "@platforms//os:linux",
    ],
    target_compatible_with = [
        "@platforms//cpu:armv7", # `v7e-mf` has not yet made it to stable Bazel?
        # "@platforms//os:none",
    ],
    toolchain = ":custom_toolchain",
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
)

The interface is currently extremely rough and is just intended to be a way to prototype support for other targets without needing to modify this repo. If there's interest, there's lots that can be done to turn this into a user-friendly API.


Currently, only the wasm32-unknown-wasi target triple is tested and actually working. However this PR lays the foundation for support for other targets; it should just be a matter of adding sysroot and compiler_rt locations for other targets and fine-tuning the LLVM target triple to Bazel constraint value mappings.


As an aside, even in adding support for just a single additional (not-hosted) target the ways in which using unix_cc_toolchain_config.bzl (#75) limit us becomes apparent (we would like to have the toolchain for wasm targets not support the dynamic linking actions since they just error; instead we're forced to use --features=-supports_dynamic_linker on the command line or transitions/selects or other such workarounds).

So far this has been manageable but this definitely casts some doubt on whether #75 is the way to go.

Another thing to consider is whether upstream Bazel would be willing to make unix_cc_toolchain_config.bzl slightly more general/whether or not our use case aligns with its intended purpose.

…_tools

(See bazel-contrib#23).

Since `@bazel_tools//tools/cpp:unix_cc_toolchain_config.bzl:cc_toolchain_config` is a rule, our `cc_toolchain_config`
needs to become a macro instead of a rule. Luckily we actually can do this without introducing any breaking changes.

Other than the discrepancies listed within (the things marked with `## NOTE:` — make vars and framework paths)
everything seemed to just fall into place. So far it seems to _just work_ 🤞.
It's true that modern versions of ld.lld support start and end groups but we don't use ld.lld on macOS (yet).
…tions in the execroot

As the message within explains, `rules_foreign_cc` will use a different
PWD than the execroot root for cmake targets which breaks the wrapper
script. This commit has the wrapper script search alongside itself for
`clang` when it isn't in the usual places.
Bazel complains about this; see: https://stackoverflow.com/questions/62451307/specify-sysroot-for-bazel-toolchain

for Linux the default default sysroot is "" so we hit this error
…chains for extra targets

doesn't actually change the sysroot/`--target` flag, etc. as needed yet; just wires things up on the bazel side

lightly tested
this will let us avoid grabbing/loading into the sandbox all the sysroots when using
any of the toolchains
…n `cc_toolchain_config`

the intention is that this can be used both by `configure.bzl` in processing `extra_targets`
and by users of this repo who wish to set up their own toolchains that are slightly different
from the provided ones *without* needed for patch this repo or recreate all the logic in it

this will also enable people to prototype setting up toolchains for other targets quickly
(which they can then easily make a PR for to add support for using the target via
`extra_targets`)
…ch the compiler files live

this is needed to allow users to create their own toolchains using our macros

because rules.bzl is *not* autogenerated with `configure`, we have to explicitly pass in the
repo where the compiler and friends live

eventually – if this is an API we want to actually support – we should provide some wrapper
functions that land in the generated repo and call the underlying functions for you and
paper over some of the sharper edges

an example usage is in the next commit
we want to get the current version of `@platforms`, not just the version bundled with the current Bazel version being used
this way we don't have to adjust the constraints produced for a particular toolchain to match the version of `@platforms` used by the current Bazel version
@rrbutani rrbutani force-pushed the feature/extra-targets-and-wasm-support branch from 563ec76 to a0f0eb4 Compare August 17, 2021 01:05
@rrbutani rrbutani force-pushed the feature/extra-targets-and-wasm-support branch from 6c86168 to a89ddef Compare September 13, 2021 03:21
@rrbutani rrbutani force-pushed the feature/extra-targets-and-wasm-support branch from 0c894e7 to 94a4975 Compare September 13, 2021 05:25
@rrbutani rrbutani force-pushed the feature/extra-targets-and-wasm-support branch from 81edc53 to 8cb5c40 Compare September 13, 2021 06:04
@rrbutani rrbutani force-pushed the feature/extra-targets-and-wasm-support branch from 8cb5c40 to 99ed2f7 Compare September 13, 2021 06:11
@rrbutani rrbutani force-pushed the feature/extra-targets-and-wasm-support branch from d5e622c to 55211fe Compare September 13, 2021 07:08
README.md Outdated Show resolved Hide resolved
@dzbarsky
Copy link
Contributor

@rrbutani Any chance you are still interested in pursuing this? wasm support would be really nice

@rrbutani
Copy link
Collaborator Author

@dzbarsky I'm still interested but I think the approach I took was pretty general; I'm not sure if it aligns with the goals of this project.

Are you interested in just wasm support or also in the extra targets/configuration bits as well?

@dzbarsky
Copy link
Contributor

I suspect a lot of users end up applying small patches to twiddle some of the options that you expose as extra configuration, so I'm not sure if its worth the complexity. For me the main benefit would be to have a way to target wasm using a platform_transition_filegroup or similar in usercode and have the cc toolchain resolution work out correctly; I'm not sure if that requires extra_targets.

@rrbutani
Copy link
Collaborator Author

For me the main benefit would be to have a way to target wasm using a platform_transition_filegroup or similar in usercode and have the cc toolchain resolution work out correctly; I'm not sure if that requires extra_targets.

I won't speak for the active maintainers of this project but I'd guess that unconditionally registering wasm/wasi toolchains (and fetching the wasi-sdk sysroot for the latter) probably isn't too controversial; there's probably no need to grow configuration options like extra_targets for this use case.

Unfortunately I won't have the cycles to work on something like this for at least the next few weeks; I'm happy to answer questions though.

@siddharthab
Copy link
Contributor

My thinking was that existing users can go through https://github.com/grailbio/bazel-toolchain#supporting-new-target-platforms and submit PRs for their target platforms, including WASM. I am a little bit reluctant to also manage the mapping between LLVM versions and WASM SDK in this project, so maybe it can be left to the users to configure the right sysroot. Maybe just adding a test for WASM here can serve as an example.

@siddharthab
Copy link
Contributor

If this is not an active topic anymore, should we close it?

@rrbutani rrbutani closed this Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants