rustup linux rustc executable not executable (symbol lookup error) #99440

Closed
matthiaskrgr opened this issue Jul 18, 2022 · 18 comments · Fixed by #99442 or #99680
Labels
C-bug Category: This is a bug.

Comments

@matthiaskrgr
Member

matthiaskrgr commented Jul 18, 2022

Since 246f66a / #99062, the Linux rustc executables distributed by rustup are no longer executable:

rustup-toolchain-install-master 246f66a905c2815f2c9b9c3d6b1e0649f3360ef8
~/.rustup/toolchains/246f66a905c2815f2c9b9c3d6b1e0649f3360ef8/bin/rustc  --version
/home/matthias/.rustup/toolchains/246f66a905c2815f2c9b9c3d6b1e0649f3360ef8/bin/rustc: symbol lookup error: /home/matthias/.rustup/toolchains/246f66a905c2815f2c9b9c3d6b1e0649f3360ef8/bin/rustc: undefined symbol: _ZN3std2rt19lang_start_internal17h614c83dc44962c80E

The toolchain before that merge (263edd4) works fine for me.

My target (is that the right term for it?) is x86_64-unknown-linux-gnu
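
For readers unfamiliar with the mangled name in the error, here is a minimal sketch of how it decodes, assuming rustc's legacy `_ZN…E` mangling with plain length-prefixed segments (no `$` escapes). The helper name `demangle_legacy` is hypothetical and is not part of any toolchain.

```sh
# Decode a legacy-mangled Rust symbol of the form _ZN<len><segment>...E.
# Sketch only: it handles simple length-prefixed segments, nothing else.
demangle_legacy() {
    rest=${1#_ZN}      # drop the _ZN prefix
    rest=${rest%E}     # drop the trailing E
    out=""
    while [ -n "$rest" ]; do
        # Leading decimal digits give the length of the next segment.
        len=$(printf '%s' "$rest" | sed -E 's/^([0-9]+).*/\1/')
        case $len in
            '' | *[!0-9]*) break ;;   # not a length prefix: stop
        esac
        rest=${rest#"$len"}
        part=$(printf '%s' "$rest" | cut -c "1-$len")
        rest=$(printf '%s' "$rest" | cut -c "$((len + 1))-")
        out="${out:+$out::}$part"
    done
    printf '%s\n' "$out"
}

# prints std::rt::lang_start_internal::h614c83dc44962c80
demangle_legacy _ZN3std2rt19lang_start_internal17h614c83dc44962c80E
```

The missing symbol is the standard library's runtime entry point, and the trailing `h…` segment is a hash, so the rustc binary is looking for it in a libstd that does not export that exact hashed name.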

@matthiaskrgr matthiaskrgr added the C-bug Category: This is a bug. label Jul 18, 2022
@Mark-Simulacrum
Member

I think it would be good to get the dynamic linker version, since perf (which also uses Linux) didn't run into this...

@workingjubilee
Member

@matthiaskrgr If this is your "default" nightly toolchain that you tried to upgrade and it borked, then it is considered the host toolchain by rustup.

@matthiaskrgr
Member Author

@workingjubilee this toolchain was installed via rustup-toolchain-install-master -f -n master -c rustc-dev llvm-tools rust-src clippy rust-analyzer rustfmt. To be precise, this is not yet the master toolchain, but it will be in a few hours.

Executing different rustcs by path directly like ~/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin/rustc has always worked regardless of rustup until now.

@jyn514
Member

jyn514 commented Jul 18, 2022

cc @Kobzol

@jyn514
Member

jyn514 commented Jul 18, 2022

I think we should revert that PR until we figure out why this is broken.

@Kobzol
Contributor

Kobzol commented Jul 18, 2022

Heh, that was fast :D I expected obscure errors, but outright failing to execute rustc... well, that's bad.

Filed #99442

@workingjubilee
Member

So the obvious question is:
How was this not caught in CI?

@jyn514
Member

jyn514 commented Jul 19, 2022

@workingjubilee it doesn't seem to happen on every platform. CI obviously ran the tests successfully, and @cuviper tested on a bunch of platforms with old glibc versions / toolchains, and it worked on those too.

@workingjubilee
Member

@matthiaskrgr May we see uname -srm && echo "---" && ld --version && echo "---" && ldd --version?

@Mark-Simulacrum
Member

Need to finish investigation, but hopefully this is now not affecting anyone.

@matthiaskrgr
Member Author

Linux 5.18.10-1-MANJARO x86_64
---
GNU ld (GNU Binutils) 2.38
Copyright (C) 2022 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.
---
ldd (GNU libc) 2.35
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

@cuviper
Member

cuviper commented Jul 19, 2022

I'm not running that kernel, but rustc 246f66a905c2815f2c9b9c3d6b1e0649f3360ef8 runs fine for me in the manjarolinux/base container, using podman on a Fedora 36 host.

@eddyb
Member

eddyb commented Jul 23, 2022

--set rust.use-lld=true was being added at the same time as passing the --icf flag with a value of all, when there apparently is also a value of safe (which in this context could maybe mean "safe to use with any dynamic linker because it behaves as if LLVM mergefunc was used originally", or something else like that).

IMO we should reland --set rust.use-lld=true right away (possibly gated on the nightly release channel so we don't accidentally end up with obscure bugs leaking into stable) and have a few weeks of waiting period to let any reports trickle in about LLD breaking our linking.

And only then we could start adding flags (we just need "the next nightly" to be scientific, but still, if LLD does have some weird bugs that haven't been seen before but rustc_driver does hit them, we need to know).

I would also recommend landing -Clink-args=-Wl,--icf=safe before -Clink-args=-Wl,--icf=all, just so we have a better chance at finding the broken link in the LLD -> ICF(safe) -> ICF(all) chain.

@Mark-Simulacrum
Member

LLD was not landed standalone because doing so is a regression for us (binary size increases). I suppose we could temporarily land such a change, but we are still failing to reproduce the bug here, so I'm hesitant to say we'd be able to verify whether LLD alone (or safe, or any other option) still causes the problem. Maybe we can rely on @matthiaskrgr to test, but that seems horrible (and could be transient, nondeterministic, etc.)

@Kobzol
Contributor

Kobzol commented Jul 23, 2022

I already ran tests with safe ICF and with LLD only. Both fail for @matthiaskrgr. So even with just LLD, it's broken in some environments.

@eddyb
Member

eddyb commented Jul 23, 2022

Both fail for @matthiaskrgr. So even with just using LLD, it's broken in some environments.

Ah, thanks for testing, I wasn't able to gather that detail from either the Zulip thread or this thread. That's the kind of situation I imagined switching only one aspect at a time would find.


The only thing I can think of for diagnosing LLD is doing something like this:

nm -S toolchain/lib/librustc_driver-*.so \
  | sed -E 's/^[0-9a-f]* //;s/^ *([a-zA-Z] )/---------------- \1/' \
  | sort -k 3 \
  > rustc_driver.syms

for 2 toolchains, using gold vs lld for linking, but with the rest of the build unaffected
(i.e. ideally only the librustc_driver-*.so should be rebuilt between the two cases)

And then diffing the two outputs. The order might differ, but ideally the sort is enough, and the symbol sizes shouldn't differ unless there's weird linker relaxation stuff going on.
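
The comparison described above can be sketched as follows. This is only an illustration: the helper names `normalize_syms` and `only_in_first` are hypothetical, and the two listing files in the usage comment stand in for whatever the gold-linked and LLD-linked builds produce.

```sh
# Normalize `nm -S` output read from stdin, using the same sed/sort steps as
# the pipeline above: drop the address column, pad entries that have no size,
# and sort by symbol name.
normalize_syms() {
    sed -E 's/^[0-9a-f]* //;s/^ *([a-zA-Z] )/---------------- \1/' | sort -k 3
}

# Print symbol names that appear in the first normalized listing but not in
# the second. Both arguments are files produced by normalize_syms.
only_in_first() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    awk '{ print $3 }' "$1" | sort -u > "$tmp_a"
    awk '{ print $3 }' "$2" | sort -u > "$tmp_b"
    comm -23 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}

# Hypothetical usage, with one listing per linker:
#   nm -S gold/lib/librustc_driver-*.so | normalize_syms > gold.syms
#   nm -S lld/lib/librustc_driver-*.so  | normalize_syms > lld.syms
#   only_in_first gold.syms lld.syms    # symbols LLD dropped
#   diff gold.syms lld.syms             # also shows size differences
```

Comparing names first keeps the output short. A full `diff` of the two files additionally surfaces symbols whose sizes changed.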

(I'm assuming we're using ld.gold, but if @matthiaskrgr saw breakage going from ld.bfd to LLD, it might be worth trying ld.gold in case it can replicate the issue)


Another experiment that might be interesting is trying to integrate mold, if it doesn't break for some more obvious reason. It's probably not ready for nightly, but it could shed some light.

@eddyb
Member

eddyb commented Jul 24, 2022

Well, ignore all that, this one was user error: @matthiaskrgr had LD_LIBRARY_PATH set, causing this behavior:

@eddyb:

$ ldd ~/.rustup/toolchains/nightly-2022-07-19-x86_64-unknown-linux-gnu/lib/librustc_driver-*.so
ldd: warning: you do not have execution permission for `/home/eddy/.rustup/toolchains/nightly-2022-07-19-x86_64-unknown-linux-gnu/lib/librustc_driver-47fae6bd857f5148.so'
        linux-vdso.so.1 (0x00007ffe512b5000)
        libstd-1aab30bac2090a15.so => /home/eddy/.rustup/toolchains/nightly-2022-07-19-x86_64-unknown-linux-gnu/lib/../lib/libstd-1aab30bac2090a15.so (0x00007fe700d83000)
        libLLVM-14-rust-1.64.0-nightly.so => /home/eddy/.rustup/toolchains/nightly-2022-07-19-x86_64-unknown-linux-gnu/lib/../lib/libLLVM-14-rust-1.64.0-nightly.so (0x00007fe6fb7fb000)

@matthiaskrgr:

$ ldd ~/.rustup/toolchains/nightly-2022-07-19-x86_64-unknown-linux-gnu/lib/librustc_driver-*.so
ldd: warning: you do not have execution permission for `/home/matthias/.rustup/toolchains/nightly-2022-07-19-x86_64-unknown-linux-gnu/lib/librustc_driver-47fae6bd857f5148.so'
    linux-vdso.so.1 (0x00007ffcae74c000)
    libstd-1aab30bac2090a15.so => /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/libstd-1aab30bac2090a15.so (0x00007f3bac800000)
    libLLVM-14-rust-1.64.0-nightly.so => /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/libLLVM-14-rust-1.64.0-nightly.so (0x00007f3ba75d2000)

(system libs omitted for brevity)

Note how in @matthiaskrgr's case, instead of doing the relative ../lib traversal specified in the rpath of librustc_driver-*.so, it actually uses the fully absolute path to the nightly toolchain.

That happened because @matthiaskrgr had this value for LD_LIBRARY_PATH:

:/home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib:/home/matthias/.rustup/toolchains/master/lib/

We can probably reland the original PR (or just turn LLD on if we want to) and see if anything else breaks.
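
A quick way to check for the misconfiguration described above is to list any LD_LIBRARY_PATH entries that point into a rustup toolchain. This is a minimal sketch: the helper name `toolchain_entries` is hypothetical, and matching on `/.rustup/toolchains/` assumes the default rustup directory layout seen in the paths above.

```sh
# Print each entry of a colon-separated search path (first argument) that
# points into a rustup toolchain directory.
# Assumption: toolchains live under a path containing /.rustup/toolchains/.
toolchain_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | grep '/\.rustup/toolchains/'
}

# Check the current environment. grep exits non-zero when nothing matches.
toolchain_entries "${LD_LIBRARY_PATH:-}" || echo "no rustup toolchain entries found"
```

Any entry printed here can cause one toolchain's rustc to load another toolchain's libraries, as the ldd output above shows. The leading `:` in the reported value also creates an empty entry, which glibc's dynamic linker treats as the current working directory.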

@eddyb
Member

eddyb commented Jul 24, 2022

Aha! And here's why @matthiaskrgr's setup only broke now:

$ ls ~/.rustup/toolchains/nightly-2022-07-{16,17,18,19,20,21,22,23,24}-x86_64-unknown-linux-gnu/lib/librustc_driver-*.so | cat
/home/eddy/.rustup/toolchains/nightly-2022-07-16-x86_64-unknown-linux-gnu/lib/librustc_driver-13073596e99db423.so
/home/eddy/.rustup/toolchains/nightly-2022-07-17-x86_64-unknown-linux-gnu/lib/librustc_driver-aa112a58a2cb5756.so
/home/eddy/.rustup/toolchains/nightly-2022-07-18-x86_64-unknown-linux-gnu/lib/librustc_driver-47fae6bd857f5148.so
/home/eddy/.rustup/toolchains/nightly-2022-07-19-x86_64-unknown-linux-gnu/lib/librustc_driver-47fae6bd857f5148.so
/home/eddy/.rustup/toolchains/nightly-2022-07-20-x86_64-unknown-linux-gnu/lib/librustc_driver-47fae6bd857f5148.so
/home/eddy/.rustup/toolchains/nightly-2022-07-21-x86_64-unknown-linux-gnu/lib/librustc_driver-47fae6bd857f5148.so
/home/eddy/.rustup/toolchains/nightly-2022-07-22-x86_64-unknown-linux-gnu/lib/librustc_driver-47fae6bd857f5148.so
/home/eddy/.rustup/toolchains/nightly-2022-07-23-x86_64-unknown-linux-gnu/lib/librustc_driver-6d0404a4e3a2c33a.so
/home/eddy/.rustup/toolchains/nightly-2022-07-24-x86_64-unknown-linux-gnu/lib/librustc_driver-4ecd731fcabc90db.so

Unrelated to the LLD change, in the same nightly, we stopped changing rustc_driver's file hash - not sure if this is intentional or coincidental.
I believe @matthiaskrgr at the time had nightly-2022-07-18 as the nightly toolchain, and so librustc_driver-47fae6bd857f5148.so was being used from there, instead of from the toolchain being executed.

EDIT: added a few more nightlies, looks like Cargo.lock changes might be what's causing the file hash to change?
Whereas libstd-*.so always has the same file hash because we force it to:

rust/src/bootstrap/builder.rs

Lines 1564 to 1602 in c32dcbb

// FIXME: Temporary fix for https://github.com/rust-lang/cargo/issues/3005
// Force cargo to output binaries with disambiguating hashes in the name
let mut metadata = if compiler.stage == 0 {
    // Treat stage0 like a special channel, whether it's a normal prior-
    // release rustc or a local rebuild with the same version, so we
    // never mix these libraries by accident.
    "bootstrap".to_string()
} else {
    self.config.channel.to_string()
};
// We want to make sure that none of the dependencies between
// std/test/rustc unify with one another. This is done for weird linkage
// reasons but the gist of the problem is that if librustc, libtest, and
// libstd all depend on libc from crates.io (which they actually do) we
// want to make sure they all get distinct versions. Things get really
// weird if we try to unify all these dependencies right now, namely
// around how many times the library is linked in dynamic libraries and
// such. If rustc were a static executable or if we didn't ship dylibs
// this wouldn't be a problem, but we do, so it is. This is in general
// just here to make sure things build right. If you can remove this and
// things still build right, please do!
match mode {
    Mode::Std => metadata.push_str("std"),
    // When we're building rustc tools, they're built with a search path
    // that contains things built during the rustc build. For example,
    // bitflags is built during the rustc build, and is a dependency of
    // rustdoc as well. We're building rustdoc in a different target
    // directory, though, which means that Cargo will rebuild the
    // dependency. When we go on to build rustdoc, we'll look for
    // bitflags, and find two different copies: one built during the
    // rustc step and one that we just built. This isn't always a
    // problem, somehow -- not really clear why -- but we know that this
    // fixes things.
    Mode::ToolRustc => metadata.push_str("tool-rustc"),
    // Same for codegen backends.
    Mode::Codegen => metadata.push_str("codegen"),
    _ => {}
}
cargo.env("__CARGO_DEFAULT_LIB_METADATA", &metadata);
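
The file-name reuse visible in the `ls` listing above can be detected mechanically. The sketch below is an illustration only: the helper name `shared_names` is hypothetical, and the usage line assumes toolchains are installed under `~/.rustup/toolchains`.

```sh
# Read library paths on stdin, one per line, and print every file name that
# occurs under more than one directory, prefixed by its occurrence count.
shared_names() {
    awk -F/ '{ count[$NF]++ }
             END { for (n in count) if (count[n] > 1) print count[n], n }' |
        sort -k 2
}

# List driver libraries across all installed toolchains and report reuse.
ls "$HOME"/.rustup/toolchains/*/lib/librustc_driver-*.so 2>/dev/null | shared_names
```

A name reported here is one the dynamic linker could resolve from the wrong toolchain if another toolchain's lib directory comes first in the search path.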

@bors bors closed this as completed in c9b3183 Jul 26, 2022