integration-test: Set rust-lld as a linker only on macOS #908

Open: wants to merge 3 commits into main

vadorovsky (Member) commented Mar 13, 2024

The recommendation for Linux hosts (from rust-lang/rust#130062) is to use
the C compiler driver as the linker, since it knows where to find
system-wide libraries. Passing a linker binary directly to -C linker
(e.g. -C linker=rust-lld) often results in errors like:

cargo:warning=error: linking with `rust-lld` failed: exit status: 1
cargo:warning=  |
cargo:warning=  = note: LC_ALL="C" PATH="/home/vadorovsky/.rustup/toolchains/stable-x86_64-un
cargo:warning=  = note: rust-lld: error: unable to find library -lgcc_s
cargo:warning=          rust-lld: error: unable to find library -lc
cargo:warning=
cargo:warning=
cargo:warning=
cargo:warning=error: aborting due to 1 previous error

Leaving the linker settings untouched is usually the best approach on Linux
systems. Native builds pick up the default C toolchain. Cross builds
default to the GCC cross wrapper, but that is easy to override with clang
and lld via RUSTFLAGS.

However, -C linker=rust-lld still works best on macOS, where
Rust toolchains ship with the unwinder and runtime, so there is usually no
need to link system libraries. Keep setting it only on macOS.

Fixes #907


netlify bot commented Mar 13, 2024

Deploy Preview for aya-rs-docs ready!

Built without sensitive environment variables

Name Link
🔨 Latest commit 9e1d8b4
🔍 Latest deploy log https://app.netlify.com/sites/aya-rs-docs/deploys/67471c609f18100008ce5273
😎 Deploy Preview https://deploy-preview-908--aya-rs-docs.netlify.app

@mergify mergify bot added the test A PR that improves test cases or CI label Mar 13, 2024
@vadorovsky vadorovsky changed the title integration-test: Prefer system-wide lld over rust-lld integration-test: Prefer system-wide lld over rust-lld on Linux Sep 7, 2024
@vadorovsky vadorovsky force-pushed the integration-tests-linker branch 3 times, most recently from c3a1ecf to 0107a99 Compare September 7, 2024 18:13
tamird (Member) commented Oct 7, 2024

Is this no longer necessary?

vadorovsky (Member, Author) commented Oct 8, 2024

> Is this no longer necessary?

I would say it's still necessary. I'm still facing the same issue when I'm building integration tests on any musl-based system.

I asked the Rust community about that in rust-lang/rust#130062 and the conclusion is basically:

"you're supposed to use the C compiler driver as the linker, as it knows where to find these libraries"

It's still a mystery to me why -C linker=rust-lld doesn't hit the same issue on Ubuntu and why I can only reproduce it on Alpine and Gentoo (musl-llvm) with *-musl toolchains, but their explanation makes sense. I ended up working around this on my side with custom RUSTFLAGS (-C linker=clang -C link-arg=-fuse-ld=lld) and by commenting out the chunk of code that this PR touches.

Let me rebase it. The red builds are still the old ones from September, unrelated to the PR itself.
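
For readers who want the clang+lld workaround mentioned above without exporting RUSTFLAGS globally, here is a minimal sketch of the equivalent per-target `cargo --config` arguments. The helper name `clang_lld_config_args` is hypothetical and not part of xtask; the config keys follow the shape of the cargo invocation quoted later in this thread.

```rust
/// Hypothetical helper (not part of xtask): builds `cargo --config` arguments
/// that make `target` link through the clang driver with lld.
///
/// Assumption: clang and lld are installed on the host and can link for `target`.
fn clang_lld_config_args(target: &str) -> Vec<String> {
    vec![
        "--config".to_string(),
        // Use the C compiler driver as the linker so it can locate system libraries.
        format!("target.{target}.linker = \"clang\""),
        "--config".to_string(),
        // Ask the driver to invoke lld instead of its default linker.
        format!("target.{target}.rustflags = \"-C link-arg=-fuse-ld=lld\""),
    ]
}
```

The returned vector can be appended to a `cargo build --target <triple>` command line.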

vadorovsky (Member, Author) commented Oct 8, 2024

On a side note, Rust nightly is currently using rust-lld by default (but still through the system C compiler which figures out the paths). If they decide to ship that in any stable version, we won't have to touch any RUSTFLAGS here anymore.

vadorovsky (Member, Author) commented Oct 8, 2024

Ubuntu fails now because of old clang. I also noticed just now that we are entering the if let Some(target) statement even for native builds, when running integration tests...

Taking a step back, I'm starting to doubt whether we should meddle with the linkers and linker flags here at all. Our default cargo configuration specifies the cross gcc wrappers as linkers, which should be good enough for Ubuntu/Debian/Fedora users even for cross-compilation scenarios. People wanting to link with clang+lld could just define their own RUSTFLAGS (at least I'm fine with doing that). I will give it a try now, and if not touching the flags works on both Ubuntu and macOS, then I think that would be the solution.

tamird (Member) commented Oct 9, 2024

> Ubuntu fails now because of old clang. I also noticed just now that we are entering the if let Some(target) statement even for native builds, when running integration tests...

That doesn't sound right. Are you sure?

> Taking a step back, I'm starting to doubt whether we should meddle with the linkers and linker flags here at all. Our default cargo configuration specifies the cross gcc wrappers as linkers, which should be good enough for Ubuntu/Debian/Fedora users even for cross-compilation scenarios. People wanting to link with clang+lld could just define their own RUSTFLAGS (at least I'm fine with doing that). I will give it a try now, and if not touching the flags works on both Ubuntu and macOS, then I think that would be the solution.

How is aarch64-linux-musl-gcc going to work for someone running on macOS? I believe I added that for exactly this reason.

vadorovsky (Member, Author) commented Oct 9, 2024

> > Ubuntu fails now because of old clang. I also noticed just now that we are entering the if let Some(target) statement even for native builds, when running integration tests...
>
> That doesn't sound right. Are you sure?

Never mind, that's not true. cargo xtask integration-test local without any --target flag doesn't enter the statement and works fine on musl distros.

What doesn't work is building/running VM tests (on a musl/Linux host). In that case target is set, rust-lld is used as the linker, and it struggles to find the libs.

$ cargo xtask integration-test vm gentoo-kernel-6.6.51-1/image/usr/src/linux-6.6.51-gentoo-dist/arch/x86/boot/bzImage
[...]
cargo:warning=  = note: LC_ALL="C" PATH="/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/bin:/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/bin:/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/bin:/usr/local/cargo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" VSLANG="1033" "rust-lld" "-flavor" "gnu" "--version-script=/tmp/rustcpfJE43/list" "--no-undefined-version" "/tmp/rustcpfJE43/symbols.o" "/src/target/debug/deps/rustversion-a51ee28261058709.rustversion.4145b995296d1182-cgu.0.rcgu.o" "/src/target/debug/deps/rustversion-a51ee28261058709.rustversion.4145b995296d1182-cgu.1.rcgu.o" "/src/target/debug/deps/rustversion-a51ee28261058709.rustversion.4145b995296d1182-cgu.2.rcgu.o" "/src/target/debug/deps/rustversion-a51ee28261058709.rustversion.4145b995296d1182-cgu.3.rcgu.o" "/src/target/debug/deps/rustversion-a51ee28261058709.rustversion.4145b995296d1182-cgu.4.rcgu.o" "/src/target/debug/deps/rustversion-a51ee28261058709.rustversion.4145b995296d1182-cgu.5.rcgu.o" "/src/target/debug/deps/rustversion-a51ee28261058709.rustversion.4145b995296d1182-cgu.6.rcgu.o" "/src/target/debug/deps/rustversion-a51ee28261058709.rustversion.4145b995296d1182-cgu.7.rcgu.o" "/src/target/debug/deps/rustversion-a51ee28261058709.rustversion.4145b995296d1182-cgu.8.rcgu.o" "/src/target/debug/deps/rustversion-a51ee28261058709.f01o1e0hk9d1mcr47j33gnmo6.rcgu.rmeta" "/src/target/debug/deps/rustversion-a51ee28261058709.8219qqgkv2oj66wv6vzyrwhri.rcgu.o" "--as-needed" "-L" "/src/target/debug/deps" "-L" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib" "-Bstatic" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libproc_macro-d2aa9b2c4004c068.rlib" 
"/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libstd-1b60b176e45a0318.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libpanic_unwind-57b45c66d2e62c49.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libobject-0123438d987de009.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libmemchr-05f3c42b49d78619.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libaddr2line-45c700404e36c3de.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libgimli-cf5f9b37b545ed6e.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/librustc_demangle-44abbf672f21a10b.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libstd_detect-9b6d712a985955e4.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libhashbrown-e4ec61de4b8ef447.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/librustc_std_workspace_alloc-c72777eb88217fe3.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libminiz_oxide-b0a8429443e0fed6.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libadler-820055e0e9cc55a4.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libunwind-2ceb129e8b1d5c77.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libcfg_if-8cd2ce184499d012.rlib" 
"/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/liblibc-ada011f709b467ef.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/liballoc-67a0423d44ab989a.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/librustc_std_workspace_core-15fc71c4e52ac90f.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libcore-273f69939f389095.rlib" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib/libcompiler_builtins-c8f82c72237d07e8.rlib" "-Bdynamic" "-lgcc_s" "-lc" "--eh-frame-hdr" "-z" "noexecstack" "-L" "/usr/local/rustup/toolchains/1.81.0-x86_64-unknown-linux-musl/lib/rustlib/x86_64-unknown-linux-musl/lib" "-o" "/src/target/debug/deps/librustversion-a51ee28261058709.so" "--gc-sections" "-shared" "-soname=librustversion-a51ee28261058709.so" "-z" "relro" "-z" "now"
cargo:warning=  = note: rust-lld: error: unable to find library -lgcc_s
cargo:warning=          rust-lld: error: unable to find library -lc
cargo:warning=
cargo:warning=
cargo:warning=
cargo:warning=error: aborting due to 1 previous error
cargo:warning=
cargo:warning=
error: could not compile `rustversion` (lib) due to 2 previous errors

After the following change:

diff --git a/xtask/src/run.rs b/xtask/src/run.rs
index 8bba7a5..b6e4b2a 100644
--- a/xtask/src/run.rs
+++ b/xtask/src/run.rs
@@ -59,8 +59,7 @@ where
     let mut cmd = Command::new("cargo");
     cmd.args(["build", "--message-format=json"]);
     if let Some(target) = target {
-        let config = format!("target.{target}.linker = \"rust-lld\"");
-        cmd.args(["--target", target, "--config", &config]);
+        cmd.args(["--target", target]);
     }
     f(&mut cmd);

It works. At least for the x86_64 scenario, where my default C toolchain is picked up. For cross scenarios, I would need to set up RUSTFLAGS manually to pick up clang+lld, but it's fine. I just don't want xtask automatically switching to rust-lld in that case.

I also checked this change on Ubuntu and it works fine there too. Using the C compiler driver as the linker indeed looks like the best approach for Linux.

> How is aarch64-linux-musl-gcc going to work for someone running on macOS? I believe I added that for exactly this reason.

I see. I guess that -C linker=rust-lld works on macOS because, unlike on Linux, rustup ships the unwinder and runtime libraries together with the toolchain, so the linker doesn't try to find them in the system.

I checked whether the combination of -C linker=clang -C link-args=-fuse-ld=lld works on macOS, but no, it fails with the following:

-contained/crtn.o"
cargo:warning=  = note: ld64.lld: error: unknown argument '--as-needed'
cargo:warning=          ld64.lld: error: unknown argument '-Bstatic', did you mean '-static'
cargo:warning=          ld64.lld: error: unknown argument '-Bdynamic', did you mean '-dynamic'
cargo:warning=          ld64.lld: error: unknown argument '--eh-frame-hdr'
cargo:warning=          ld64.lld: error: unknown argument '-z'
cargo:warning=          ld64.lld: error: unknown argument '--gc-sections'
cargo:warning=          ld64.lld: error: unknown argument '-z'
cargo:warning=          ld64.lld: error: unknown argument '-z'
cargo:warning=          ld64.lld: error: unknown argument '--strip-debug'
cargo:warning=          clang: error: linker command failed with exit code 1 (use -v to see invocation)
cargo:warning=
cargo:warning=
cargo:warning=
cargo:warning=error: aborting due to 1 previous error
cargo:warning=
cargo:warning=
error: could not compile `init` (bin "init") due to 2 previous errors
Error: building init program failed
Caused by:
    "cargo" "build" "--message-format=json" "--target" "aarch64-unknown-linux-musl" "--target" "aarch64-unknown-linux-musl" "--config" "target.aarch64-unknown-linux-musl.linker = \"clang\"" "--config" "target.aarch64-unknown-linux-musl.rustflags = \"-C link-arg=-fuse-ld=lld\"" "--package" "init" "--profile" "release" failed: ExitStatus(unix_wait_status(25856))

So I guess that we could apply -C linker=rust-lld only on macOS host - hacky, but I think that will make everyone happy.
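
The macOS-only idea above could be sketched roughly as follows. This is an illustration under assumptions, not the code that landed in the PR: `target_args` and `build_command` are hypothetical names, and the host check relies on `cfg!(target_os = "macos")`, which reflects the platform the xtask binary itself was compiled for.

```rust
use std::process::Command;

/// Hypothetical helper: extra cargo arguments for a build with an explicit target.
/// `host_is_macos` is passed in so the decision is easy to exercise in tests.
fn target_args(target: &str, host_is_macos: bool) -> Vec<String> {
    let mut args = vec!["--target".to_string(), target.to_string()];
    if host_is_macos {
        // On macOS hosts there is usually no Linux C toolchain to act as the
        // linker driver, so fall back to the rust-lld bundled with the toolchain.
        args.push("--config".to_string());
        args.push(format!("target.{target}.linker = \"rust-lld\""));
    }
    // On Linux hosts the linker is left alone: cargo config or RUSTFLAGS decide.
    args
}

/// Hypothetical wrapper mirroring the shape of the snippet from xtask/src/run.rs.
fn build_command(target: Option<&str>) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.args(["build", "--message-format=json"]);
    if let Some(target) = target {
        cmd.args(target_args(target, cfg!(target_os = "macos")));
    }
    cmd
}
```

Keeping the host check as a plain boolean parameter means both branches can be tested on any platform.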

@vadorovsky vadorovsky force-pushed the integration-tests-linker branch 2 times, most recently from ce23f2e to 52045f7 Compare October 9, 2024 16:54
@vadorovsky vadorovsky changed the title integration-test: Prefer system-wide lld over rust-lld on Linux integration-test: Set rust-lld as a linker only on macOS Oct 9, 2024
tamird (Member) commented Oct 9, 2024

Thanks for the detail. Let's give it a shot. Can we add a CI job that verifies that everything works as expected?

@vadorovsky vadorovsky force-pushed the integration-tests-linker branch 5 times, most recently from 8175be3 to ae94156 Compare October 12, 2024 16:01
tamird (Member) left a comment

Reviewed 1 of 1 files at r1, 2 of 2 files at r3, all commit messages.
Reviewable status: 1 of 2 files reviewed, 3 unresolved discussions (waiting on @vadorovsky)


.github/workflows/ci.yml line 192 at r2 (raw file):

    steps:
      - name: Log in to GHCR
        run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin

is there a reason we can't use github actions' "native" syntax for docker actions? https://docs.github.com/en/actions/sharing-automations/creating-actions/creating-a-docker-container-action#creating-an-action-metadata-file


.github/workflows/ci.yml line 274 at r2 (raw file):

      - name: Install prerequisites
        if: runner.os == 'Linux' && contains(matrix.container, 'alpine')
        # Use clang for building the C eBPF programs for integration tests.

can we document each dep please as we do above?


xtask/src/run.rs line 63 at r1 (raw file):

        cmd.args(["--target", target]);
        // Always use rust-lld on macOS hosts. See
        // https://github.com/aya-rs/aya/pull/908#issuecomment-2402813711

let's not do this. please write a sensible explanation here; don't make me go to a (mutable) comment in github

The recommendation (coming from rust-lang/rust#130062) for Linux hosts
is using C compiler driver as a linker, which is able to find
system-wide libraries. Using linker binaries directly in `-C linker`
(e.g. `-C linker=rust-lld`) often results in errors like:

```
cargo:warning=error: linking with `rust-lld` failed: exit status: 1
cargo:warning=  |
cargo:warning=  = note: LC_ALL="C" PATH="/home/vadorovsky/.rustup/toolchains/stable-x86_64-un
cargo:warning=  = note: rust-lld: error: unable to find library -lgcc_s
cargo:warning=          rust-lld: error: unable to find library -lc
cargo:warning=
cargo:warning=
cargo:warning=
cargo:warning=error: aborting due to 1 previous error
```

Leaving the linker settings untouched is usually the best approach on Linux
systems. Native builds pick up the default C toolchain. Cross builds
default to the GCC cross wrapper, but that is easy to override with clang and
lld via RUSTFLAGS.

However, `-C linker=rust-lld` still works best on macOS, where Rust
toolchains ship with libc and the runtime library, so there is no need to
link any system libraries. Keep setting it only on macOS.

Fixes aya-rs#907
@vadorovsky vadorovsky force-pushed the integration-tests-linker branch 2 times, most recently from ec3ab30 to c37355e Compare November 26, 2024 14:37
@vadorovsky vadorovsky marked this pull request as draft November 26, 2024 16:01
@vadorovsky vadorovsky force-pushed the integration-tests-linker branch 18 times, most recently from 26f5b24 to ae8eece Compare November 27, 2024 11:14
vadorovsky (Member, Author) left a comment

Reviewable status: 0 of 3 files reviewed, 3 unresolved discussions (waiting on @tamird)


.github/workflows/ci.yml line 192 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

is there a reason we can't use github actions' "native" syntax for docker actions? https://docs.github.com/en/actions/sharing-automations/creating-actions/creating-a-docker-container-action#creating-an-action-metadata-file

I removed the custom image altogether. I left a comment, but tl;dr: runners don't support running containers as regular users, because of host mounts owned by root.


.github/workflows/ci.yml line 274 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can we document each dep please as we do above?

done


xtask/src/run.rs line 63 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

let's not do this. please write a sensible explanation here; don't make me go to a (mutable) comment in github

fair, what about now?

It's hard to predict what's the PID of the first process in a container.
Use this assertion only on non-containerized systems.
This way we are making sure that the integration tests infra doesn't
regress on musl environments.
@vadorovsky vadorovsky marked this pull request as ready for review November 27, 2024 13:36
tamird (Member) left a comment

Reviewed 2 of 2 files at r5, 2 of 2 files at r8, 1 of 1 files at r9, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @vadorovsky)


xtask/src/run.rs line 76 at r5 (raw file):

        // linker (like lld or mold) can be done with `-C link-arg=-fuse-ld=`.
        //
        // However, the same doesn't hold true for macOS. Cross toolchains for

i'm not sure i understand how this works. if we were compiling for linux-gnu instead of linux-musl i think we'd still have a problem, no?

in other words i think this is a property of musl vs gcc rather than the host being mac vs linux, but i haven't tested it. how sure are you?


.github/workflows/ci.yml line 209 at r9 (raw file):

            container:
              image: docker.io/alpine:3.20
              options: --privileged -v /sys/fs/bpf:/sys/fs/bpf -v /sys/kernel:/sys/kernel

can you use the volumes attribute?

      volumes:
        - my_docker_volume:/volume_mount

https://docs.github.com/en/actions/writing-workflows/choosing-where-your-workflow-runs/running-jobs-in-a-container


.github/workflows/ci.yml line 230 at r9 (raw file):

        # git is needed for the `checkout` action.
        #
        # libstdc++ is a dependency of llvm-sys, which is a dependency of

nit: libstdc++-dev, no?


.github/workflows/ci.yml line 285 at r9 (raw file):

      # fatal: detected dubious ownership in repository at '/__w/aya/aya'
      #
      # Which makes a lot of sense, not running regular git commands as root is

i think this isn't about running git as root, it's about git noticing that you're in a repo that isn't owned by you


.github/workflows/ci.yml line 287 at r9 (raw file):

      # Which makes a lot of sense, not running regular git commands as root is
      # a good thing. However, using a container image with a regular user
      # results in permission errors thrown by the runner binary.[0] It's most

this should be on the --privileged option way up on the container settings

could we just not run the BPF iterator tests if we notice we're in a container? That would also allow you to ditch the previous commit but the big win is not running as root
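
One way the "notice we're in a container" check could look is sketched below. It is a heuristic under assumptions, not something the thread settled on: `looks_like_container` is a hypothetical name, and the check relies only on the marker files that Docker (`/.dockerenv`) and Podman (`/run/.containerenv`) create, so other runtimes would go undetected.

```rust
use std::path::Path;

/// Hypothetical heuristic: returns true if the filesystem rooted at `root`
/// carries one of the well-known container marker files.
///
/// Taking `root` as a parameter (normally `/`) keeps the check testable.
fn looks_like_container(root: &Path) -> bool {
    // Docker creates `/.dockerenv`; Podman creates `/run/.containerenv`.
    root.join(".dockerenv").exists() || root.join("run/.containerenv").exists()
}
```

A test could then return early when `looks_like_container(Path::new("/"))` is true, at the cost of silently skipping coverage in containerized CI.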


.github/workflows/ci.yml line 295 at r9 (raw file):

      - name: Mark the directory as safe for git
        if: matrix.container != ''
        run: git config --global --add safe.directory /__w/aya/aya

can we avoid using __w? isn't there a github root thing we can read?


test/integration-test/src/tests/iter.rs line 23 at r8 (raw file):

    assert_eq!(line_title, "tgid     pid      name");
    // It's hard to predict what's the PID of the first process in a container.

can we make the assertion agnostic to the pid number but still check it? e.g. we could split each line on whitespace and assert we find init or systemd?
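
The suggestion above could be sketched like this. It assumes the iterator output has the three whitespace-separated columns named in the header (`tgid`, `pid`, `name`) and that the process name contains no spaces; `find_task` is a hypothetical helper, not code from the test suite.

```rust
/// Hypothetical helper: scans the rows after the header line and returns the
/// first task whose name is one of `names`, whatever its tgid/pid happen to be.
fn find_task<'a>(output: &'a str, names: &[&str]) -> Option<(u32, u32, &'a str)> {
    output.lines().skip(1).find_map(|line| {
        let mut fields = line.split_whitespace();
        // Rows that do not parse as `<tgid> <pid> <name>` are skipped.
        let tgid: u32 = fields.next()?.parse().ok()?;
        let pid: u32 = fields.next()?.parse().ok()?;
        let name = fields.next()?;
        names.iter().any(|n| *n == name).then_some((tgid, pid, name))
    })
}
```

A test could then assert `find_task(&output, &["init", "systemd"]).is_some()` instead of hard-coding the PID of the first process.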

Labels: test (A PR that improves test cases or CI)
Projects: None yet
Linked issue (may be closed by merging this pull request): integration-test: rust-lld invoked by cargo xtask integration-test vm fails to find libraries
2 participants