__asan_globals_registered is not comdat when building a staticlib with LTO #113404

Closed
glandium opened this issue Jul 6, 2023 · 32 comments · Fixed by #114946
Labels
A-linkage Area: linking into static, shared libraries and binaries A-sanitizers Area: Sanitizers for correctness and code quality. C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@glandium
Contributor

glandium commented Jul 6, 2023

Disclaimer: I tried to create a testcase from scratch, but for some reason I couldn't find a way to trigger the use of __asan_register_elf_globals instead of __asan_register_globals.

STR:

  • Clone https://github.com/glandium/nss-builtins
  • cd nss-builtins
  • RUSTFLAGS="-Zsanitizer=address" CARGO_PROFILE_RELEASE_LTO=true cargo +nightly build --release
  • objdump -t target/release/libbuiltins_static.a | grep asan_globals_registered

Actual output:

0000000000000000 l     O .bss.___asan_globals_registered	0000000000000008 ___asan_globals_registered
0000000000000000 l    d  .bss.___asan_globals_registered	0000000000000000 .bss.___asan_globals_registered

Expected output:
Something like:

0000000000000008       O *COM*	0000000000000008 .hidden ___asan_globals_registered

This doesn't happen without LTO.
The unfortunate consequence shows up when the resulting static library is linked with C or C++ code compiled by clang with -fsanitize=address -fsanitize-address-globals-dead-stripping (the latter flag is now the default in clang trunk). That code also uses __asan_register_elf_globals/__asan_globals_registered, and ODR violation detection kicks in, complaining about globals defined multiple times, because both the clang-side asan constructor and the rust asan constructor register all the globals. Normally both constructors use the same __asan_globals_registered (hence it normally being *COM*) and set its value, so only one constructor registers the globals. With the LTOed staticlib there are two distinct __asan_globals_registered, so both constructors go through.
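The guard described above can be modelled in a few lines of Rust. This is a simplified sketch, not the sanitizer runtime: `run_constructor` and the two helper functions are hypothetical names, and an `AtomicUsize` stands in for the `___asan_globals_registered` flag.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Models one ASan module constructor (hypothetical helper): it registers
/// the globals only if the flag it was linked against is still zero, and
/// sets the flag either way. Returns true if this call did the registration.
fn run_constructor(flag: &AtomicUsize, registrations: &mut usize) -> bool {
    // `swap` stores 1 and returns the previous value.
    if flag.swap(1, Ordering::SeqCst) == 0 {
        *registrations += 1;
        true
    } else {
        false
    }
}

/// Expected case: the linker merged the common symbol, so the clang-side
/// and rust-side constructors see the same flag.
fn registrations_with_shared_flag() -> usize {
    let flag = AtomicUsize::new(0);
    let mut count = 0;
    run_constructor(&flag, &mut count); // clang-side constructor
    run_constructor(&flag, &mut count); // rust-side constructor
    count
}

/// Buggy case: the rust-side flag was localized, so each constructor
/// checks its own private copy.
fn registrations_with_private_flags() -> usize {
    let clang_flag = AtomicUsize::new(0);
    let rust_flag = AtomicUsize::new(0);
    let mut count = 0;
    run_constructor(&clang_flag, &mut count);
    run_constructor(&rust_flag, &mut count);
    count
}
```

With one shared flag the globals are registered once; with two private flags they are registered twice, which is what the ODR checker then reports.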

rustc +nightly --version --verbose:

rustc 1.72.0-nightly (d9c13cd45 2023-07-05)
binary: rustc
commit-hash: d9c13cd4531649c2028a8384cb4d4e54f985380e
commit-date: 2023-07-05
host: x86_64-unknown-linux-gnu
release: 1.72.0-nightly
LLVM version: 16.0.5

(Edit: fixed typos, changed the peculiar setup with -Clto and -Cembed-bitcode=yes to the more normal LTO, which shows the problem too)

@glandium glandium added the C-bug Category: This is a bug. label Jul 6, 2023
@danakj
Contributor

danakj commented Jul 12, 2023

https://bugs.chromium.org/p/chromium/issues/detail?id=1459233 is tracking this for Chromium as it causes our asan bots to fail.

@rnk

rnk commented Jul 12, 2023

Here is the code which creates the __asan_globals_registered global:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp#L2216

It's interesting that it uses common linkage (*COM* as you said). Common linkage should work, but to me it is a surprising choice. I would've expected this global to use linkonce_odr linkage and be marked comdat, which is what you would get for a C++17 inline global.

I don't know how Rust is producing LTOed static libraries, but maybe somewhere along the way there is a bug in how LTO is handling common linkage globals.

One possible summary of this issue is that the ODR violation detector is suffering from an ODR violation, we have two flags when we should have one.

@glandium
Contributor Author

One possible summary of this issue is that the ODR violation detector is suffering from an ODR violation, we have two flags when we should have one.

I like this take :)

@MaskRay
Contributor

MaskRay commented Jul 12, 2023

I consider myself quite familiar with asan, LTO, Clang Driver, but I know very little about Rust.

rm -r target/release/libbuiltins_static.a
RUSTFLAGS="-Zsanitizer=address" CARGO_PROFILE_RELEASE_LTO=true cargo +nightly build --release --verbose

gives me this command (after removing --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat):

rustc --crate-name builtins_static --edition=2021 src/lib.rs --diagnostic-width=159 --crate-type staticlib --emit=dep-info,link -C opt-level=3 -C lto -C metadata=9ef9aaa79a14308e -C extra-filename=-9ef9aaa79a14308e --out-dir /tmp/p/nss-builtins/target/release/deps -L dependency=/tmp/p/nss-builtins/target/release/deps --extern pkcs11_bindings=/tmp/p/nss-builtins/target/release/deps/libpkcs11_bindings-cb6ed583361f31fa.rlib --extern smallvec=/tmp/p/nss-builtins/target/release/deps/libsmallvec-e6a97ae5d501d626.rlib -Zsanitizer=address
% ar t /tmp/p/nss-builtins/target/release/deps/libbuiltins_static-9ef9aaa79a14308e.a | wc -l
177
% ar x /tmp/p/nss-builtins/target/release/deps/libbuiltins_static-9ef9aaa79a14308e.a builtins_static-9ef9aaa79a14308e.builtins_static.7af90105-cgu.0.rcgu.o
% readelf -Ws builtins_static-9ef9aaa79a14308e.builtins_static.7af90105-cgu.0.rcgu.o | grep ___asan_globals_registered
   117: 0000000000000000     8 OBJECT  LOCAL  DEFAULT 3732 ___asan_globals_registered
  2944: 0000000000000000     0 SECTION LOCAL  DEFAULT 3732 .bss.___asan_globals_registered
% llvm-nm -gU builtins_static-9ef9aaa79a14308e.builtins_static.7af90105-cgu.0.rcgu.o
0000000000000000 T BUILTINSC_GetFunctionList
0000000000000000 V DW.ref.rust_eh_personality
0000000000000000 D _ZN3std3sys4unix4args3imp15ARGV_INIT_ARRAY17h244e25de9c1d3c88E
0000000000000000 T rust_eh_personality

Most defined symbols are localized. I do not know why the 4 symbols are special and rustc lto doesn't localize them.

I don't know how to rerun the rustc command with a locally built rustc (./x.py build with config.toml containing [rust]\ndebug=true).
I suspect that preventing ___asan_globals_registered from being localized will fix this bug.

@MaskRay
Contributor

MaskRay commented Jul 12, 2023

-fcommon is considered bad nowadays, but the ___asan_globals_registered COMMON symbol use case is fine. It is like a COMDAT group containing just a variable. COMMON is more size efficient than using a COMDAT group (there is a size overhead due to a 64 byte Elf64_Shdr header).

If we use a COMDAT for ___asan_globals_registered but rustc compiler/rustc_codegen_llvm/src/back/lto.rs still localizes the symbol, then we'd still have this bug.
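A toy model of the point above, assuming a much-simplified linker: `Binding`, `Symbol`, and `definitions_after_link` are hypothetical names, and only the distinction between "merged by name" and "local to one object" is modelled. It does not distinguish COMMON from COMDAT, since the argument is that either one is defeated once the symbol is localized.

```rust
/// How a symbol is visible to the (simplified) linker.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Binding {
    /// Private to its object file; never merged with anything.
    Local,
    /// Merged by name across object files (COMMON, or a COMDAT group).
    Merged,
}

struct Symbol {
    name: &'static str,
    binding: Binding,
}

/// Counts how many distinct storage locations exist for `name` once all
/// object files have been linked together.
fn definitions_after_link(objects: &[Vec<Symbol>], name: &str) -> usize {
    let mut count = 0;
    let mut merged_copy_seen = false;
    for object in objects {
        for symbol in object.iter().filter(|s| s.name == name) {
            match symbol.binding {
                // Every local definition keeps its own storage.
                Binding::Local => count += 1,
                // All merged definitions collapse into a single one.
                Binding::Merged => {
                    if !merged_copy_seen {
                        merged_copy_seen = true;
                        count += 1;
                    }
                }
            }
        }
    }
    count
}
```

Two `Merged` definitions of `___asan_globals_registered` collapse into one; as soon as one of them is `Local`, there are two flags again.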

@ChrisDenton ChrisDenton added the needs-triage-legacy Old issue that were never triaged. Remove this label once the issue has been sufficiently triaged. label Jul 16, 2023
@Noratrieb Noratrieb added A-linkage Area: linking into static, shared libraries and binaries T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-sanitizers Area: Sanitizers for correctness and code quality. and removed needs-triage-legacy Old issue that were never triaged. Remove this label once the issue has been sufficiently triaged. labels Jul 17, 2023
aarongable pushed a commit to chromium/chromium that referenced this issue Jul 19, 2023
fuchsia-fyi-x64-asan build is broken due to detect_odr_violation in
rust.[1] and search for "odr-violation" in [2].

It's a known issue [3] and also impacts other platforms [4].

A previous attempt to disable the
sanitize-address-globals-dead-stripping
(in https://crrev.com/c/4690811) fixed media_unittests but broke
base_unittests. So I ended up deciding to follow the suggestion from
the sanitizer itself to use the environment variable.

Since there isn't a asan try, it takes 6+ hours to run, I have to
manually test the change locally. With this change, both media_unittests
and base_unittests are passing when is_asan = true in gn args.

[1]: https://chromium-swarm.appspot.com/task?id=6360bb22591abb10
[2]: https://cas-viewer.appspot.com/projects/chromium-swarm/instances/default_instance/blobs/be139e3b5b49518b8a83be3fcdc5c209dc2acc2c9f725c558dd3fc2aac93a558/404237?filename=emulator_log.serial
[3]: rust-lang/rust#113404
[4]: https://crbug.com/1459233

Bug: 1459233, 1465997
Change-Id: I36fcbb947f87bafaa618bb5de5d631ad895a1bac
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4688280
Reviewed-by: David Dorwin <ddorwin@chromium.org>
Reviewed-by: Arthur Eubanks <aeubanks@google.com>
Commit-Queue: Zijie He <zijiehe@google.com>
Cr-Commit-Position: refs/heads/main@{#1172603}
zeng450026937 pushed a commit to zeng450026937/build that referenced this issue Jul 24, 2023
NOKEYCHECK=True
GitOrigin-RevId: 78d978e2ea02c1703ed1c65e2da7f04eda84a4ed
@anforowicz
Contributor

Let me try to help move this bug forward by attempting to answer some of the questions above. I have limited experience with linkers, asan, and compilers, so please shout if there are any mistakes below.


RE: @MaskRay: I don't know how to rerun the rustc with a locally built (./x.py build)

I tweaked the original cargo cmdline (from the first comment/report on this bug) by adding RUSTC=<path to locally built rustc> and adding -Zbuild-std --target x86_64-unknown-linux-gnu (and other than that I've rerun the original repro steps + debugging steps from your earlier comment at #113404 (comment)):

RUSTC=$HOME/src/github/rust/build/x86_64-unknown-linux-gnu/stage1/bin/rustc RUSTFLAGS="-Zsanitizer=address" CARGO_PROFILE_RELEASE_LTO=true cargo +nightly build --release -Zbuild-std --target x86_64-unknown-linux-gnu

(Note that this changes the path where the build artifacts are - e.g. target/x86_64-unknown-linux-gnu/release/deps/libbuiltins_static-9bce9c00959dd948.a instead of target/release/deps/libbuiltins_static-9ef9aaa79a14308e.a)


RE: @MaskRay: I do not know why the 4 symbols are special and rustc lto doesn't localize them.

I am not sure if these are the reasons, but this is what I've found for some of the symbols emitted by llvm-nm -gU ...:

  • I see that rust/compiler/rustc_codegen_llvm/src/context.rs marks the EH personality with llvm::UnnamedAddr::Global

  • AFAIU _ZN3std3sys4unix4args3imp15ARGV_INIT_ARRAY17h244e25de9c1d3c88E corresponds to the static initializer in rust/library/std/src/sys/unix/args.rs which is marked as #[used]

  • I see that BUILTINSC_GetFunctionList comes from the repro and is declared as #[no_mangle] which AFAIU tells rustc to export the function.

    • FWIW, I also see that rust/compiler/rustc_codegen_ssa/src/back/symbol_export.rs special-cases "rust_eh_personality", but changing this to cover names with an "asan" substring didn't seem to have an effect on the llvm-nm -gU ... output.

RE: @MaskRay: I suspect that preventing ___asan_globals_registered from being localized will fix this bug.

Can you please elaborate on that? From your #113404 (comment), it seems that you are saying that a change in rustc is needed - did I get that right?

OTOH, it seems that ___asan_globals_registered comes from outside of rustc sources - it comes from llvm-project/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp and therefore it seems that maybe we need to change how it is marked for linking by LLVM? Interestingly the comment there mentions "a local symbol" - not sure if this is relevant:

// ASan version script has __asan_* wildcard. Triple underscore prevents a
// linker (gold) warning about attempting to export a local symbol.
const char kAsanGlobalsRegisteredFlagName[] = "___asan_globals_registered";

@anforowicz
Contributor

anforowicz commented Jul 27, 2023

/cc @eugenis who AFAICT added the kAsanGlobalsRegisteredFlagName comment above in llvm/llvm-project@964f466#diff-74cc02fff013e3ab08aefcad70e555b1177549f8403d03395c856c3f8c5b5875

@zmodem
Contributor

zmodem commented Aug 8, 2023

/cc @tru fyi. Although this isn't fully understood and it may well be that the fix will be on the rust side, this is a rustc/clang interop issue that's new with the upcoming clang 17 release, so perhaps there should at least be a "known issues" release note.

@tru

tru commented Aug 8, 2023

cc @nikic

@nikic
Contributor

nikic commented Aug 8, 2023

Probably the symbol needs to be added to the symbol export list; see the existing handling for __llvm_profile and __msan symbols in:

if tcx.sess.instrument_coverage() || tcx.sess.opts.cg.profile_generate.enabled() {

@anforowicz
Contributor

@nikic - thanks for the pointer! After adding the following to compiler/rustc_codegen_ssa/src/back/symbol_export.rs:

if tcx.sess.opts.unstable_opts.sanitizer.contains(SanitizerSet::ADDRESS) {
    // Similar to profiling, preserve weak asan symbols during LTO.
    const ASAN_WEAK_SYMBOLS: [&str; 1] = ["___asan_globals_registered"];

    symbols.extend(ASAN_WEAK_SYMBOLS.into_iter().map(|sym| {
        let exported_symbol = ExportedSymbol::NoDefId(SymbolName::new(tcx, sym));
        (
            exported_symbol,
            SymbolExportInfo {
                level: SymbolExportLevel::C,
                kind: SymbolExportKind::Data,
                used: false,
            },
        )
    }));
}

I am now seeing ___asan_globals_registered in the output of llvm-nm (which I assume means that the problem is fixed based on #113404 (comment)):

$ $HOME/src/github/rust/build/x86_64-unknown-linux-gnu/llvm/bin/llvm-nm -gU builtins_static-9bce9c00959dd948.builtins_static.29256d2b2790aa9a-cgu.1.rcgu.o
0000000000000000 T BUILTINSC_GetFunctionList
0000000000000000 V DW.ref.rust_eh_personality
0000000000000000 D _ZN3std3sys4unix4args3imp15ARGV_INIT_ARRAY17h45459e0562778ae4E
0000000000000008 C ___asan_globals_registered
0000000000000000 T rust_eh_personality

Repro steps for completeness:

  1. cd ~/src/github/rust
  2. edit compiler/rustc_codegen_ssa/src/back/symbol_export.rs
  3. ./x build
  4. cd ~/scratch/rustc-test/nss-builtins
  5. rm -rf target
  6. RUSTC=$HOME/src/github/rust/build/x86_64-unknown-linux-gnu/stage1/bin/rustc RUSTFLAGS="-Zsanitizer=address" CARGO_PROFILE_RELEASE_LTO=true cargo +nightly build --release -Zbuild-std --target x86_64-unknown-linux-gnu
  7. ar t target/x86_64-unknown-linux-gnu/release/deps/libbuiltins_static-9bce9c00959dd948.a | grep builtins_static
  8. ar x target/x86_64-unknown-linux-gnu/release/deps/libbuiltins_static-9bce9c00959dd948.a builtins_static-9bce9c00959dd948.builtins_static.29256d2b2790aa9a-cgu.1.rcgu.o
  9. $HOME/src/github/rust/build/x86_64-unknown-linux-gnu/llvm/bin/llvm-nm -gU builtins_static-9bce9c00959dd948.builtins_static.29256d2b2790aa9a-cgu.1.rcgu.o

I'll try to put together a PR with the changes above.

I think one of the first steps is to figure out if/how to add tests for this. I'll try to see if I can cargo cult something from either e2acaee (Add codegen test that makes sure PGO instrumentation is emitted as expected), 4053e25 (librustc_trans: Mark some profiler symbols as exported to avoid LTO removing them), d8c661a (Mark __msan_keep_going as an exported symbol for LTO), or 2c0845c (Mark __msan_track_origins as an exported symbol for LTO, by @nikic - thanks again for the code pointer!).

FWIW, I think it would be nice to consolidate the code from the 3 similar cases: instrument_coverage() || profile_generate.enabled(), SanitizerSet::MEMORY, and SanitizerSet::ADDRESS. I plan to have a separate commit that refactors gathering all symbol names into a single vector and then uses a single/shared symbols.extend(...iter().map(|sym| { ... })).
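The consolidation could look roughly like the sketch below. It does not use rustc's real types: `Options`, `ExportInfo`, `instrumentation_symbols`, and `extend_exports` are hypothetical stand-ins for `tcx.sess`, `SymbolExportInfo`, and the code in symbol_export.rs. Only the five symbol names are taken from this thread.

```rust
/// Hypothetical stand-in for the relevant session options.
#[derive(Default)]
struct Options {
    instrument_coverage: bool,
    profile_generate: bool,
    sanitize_memory: bool,
    sanitize_address: bool,
}

/// Hypothetical stand-in for `SymbolExportInfo`.
#[derive(Debug, PartialEq)]
struct ExportInfo {
    is_data: bool,
    used: bool,
}

/// Gathers every LLVM-injected symbol name into a single vector.
fn instrumentation_symbols(opts: &Options) -> Vec<&'static str> {
    let mut names = Vec::new();
    if opts.instrument_coverage || opts.profile_generate {
        names.extend(["__llvm_profile_raw_version", "__llvm_profile_filename"]);
    }
    if opts.sanitize_memory {
        names.extend(["__msan_keep_going", "__msan_track_origins"]);
    }
    if opts.sanitize_address {
        names.push("___asan_globals_registered");
    }
    names
}

/// A single shared `extend` call replaces the three near-identical ones.
fn extend_exports(symbols: &mut Vec<(String, ExportInfo)>, opts: &Options) {
    symbols.extend(instrumentation_symbols(opts).into_iter().map(|sym| {
        (sym.to_string(), ExportInfo { is_data: true, used: false })
    }));
}
```

Keeping the name gathering separate from the export metadata means a new instrumentation symbol only needs one extra line.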

Finally, I am a bit surprised that compiler/rustc_codegen_ssa/src/back/symbol_export.rs needs to be aware of symbol names used in a far-away LLVM land where AFAIU ASAN and MSAN are implemented. This arrangement seems fragile. OTOH, it seems that this does fix the issue at hand, so it seems like a reasonable way to proceed.

@nikic
Contributor

nikic commented Aug 8, 2023

Finally, I am a bit surprised that compiler/rustc_codegen_ssa/src/back/symbol_export.rs needs to be aware of symbol names used in a far-away LLVM land where AFAIU ASAN and MSAN are implemented. This arrangement seems fragile. OTOH, it seems that this does fix the issue at hand, so it seems like a reasonable way to proceed.

This is because Rust assumes that all code provided to (non-plugin) LTO comes from Rust, so it knows about all symbols that are involved. This doesn't hold up for symbols that get injected by LLVM, so they need to be special-cased.

@anforowicz
Contributor

anforowicz commented Aug 8, 2023

TL;DR: I would appreciate help with creating a regression test for this.


I have trouble creating a regression test for this. Based on the contributor docs, FileCheck tests under tests/codegen run with --emit=llvm-ir and it seems that exactly the same LLVM IR is present in rust/build/x86_64-unknown-linux-gnu/test/codegen/sanitizer/address-sanitizer-globals-tracking/address-sanitizer-globals-tracking.ll before and after the changes from #113404 (comment). In other words, the following test (say, added in tests/codegen/sanitizer/address-sanitizer-globals-tracking.rs) passes before and after the change:

// Verifies that AddressSanitizer symbols show up as expected in LLVM IR
// with -Zsanitizer (DO NOT SUBMIT: consider adding non-LTO and fat LTO test flavours/revisions?)
//
// needs-sanitizer-address
// compile-flags: -Zsanitizer=address -C lto

#![crate_type="lib"]

// The test below mimics `CACHED_POW10` from `library/core/src/num/flt2dec/strategy/grisu.rs` which
// (because of incorrect handling of `___asan_globals_registered` during LTO) was incorrectly
// reported as an ODR violation in https://crbug.com/1459233#c1.
//
// See https://github.com/rust-lang/rust/issues/113404 for more discussion.
//
// CHECK: @___asan_globals_registered = common hidden global i64 0
// CHECK: @__start_asan_globals = extern_weak hidden global i64
// CHECK: @__stop_asan_globals = extern_weak hidden global i64
pub static CACHED_POW10: [(u64, i16, i16); 4] = [
    (0xe61acf033d1a45df, -1087, -308),
    (0xab70fe17c79ac6ca, -1060, -300),
    (0xff77b1fcbebcdc4f, -1034, -292),
    (0xbe5691ef416bd60c, -1007, -284),
];

AFAICT the test above replicates the following rustc cmdline flags from #113404 (comment): -Zsanitizer=address, -C lto. It doesn't replicate --crate-type staticlib, but I haven't pursued this direction further, because #![crate_type="staticlib"] necessitates -C prefer-dynamic=false and no longer includes ___asan_globals_registered in the LLVM IR. The presence or absence of -C opt-level=3 doesn't seem to change the result here.

I notice that #113404 (comment) (and my repro steps in #113404 (comment)) reported seeing --emit=dep-info,link rather than --emit=llvm-ir. So maybe my changes only have effect on stages after LLVM-IR generation? I dunno... Not sure if this hypothesis makes sense...

@tmiasko
Contributor

tmiasko commented Aug 8, 2023

When a module doesn't export any symbols, getUniqueModuleId returns an empty string and instrumentation no longer uses ___asan_globals_registered:

https://github.com/rust-lang/llvm-project/blob/7c612e1732f3976fcfe29526ad796cbb6174b829/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp#L2524-L2526

You can export an arbitrary symbol. For example mark CACHED_POW10 as #[no_mangle] and then use --crate-type=staticlib.
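A toy model of the behaviour described above, under loud assumptions: `Global`, `unique_module_id`, and `uses_registered_flag` are hypothetical, and the hash is Rust's `DefaultHasher` rather than whatever LLVM's getUniqueModuleId actually computes. The only property carried over from the thread is that a module without exported symbols gets an empty id, and then the registered flag is not used.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical description of a global in a module.
struct Global {
    name: &'static str,
    exported: bool,
}

/// Toy stand-in for getUniqueModuleId: derives an id from the names of the
/// exported globals, or returns an empty string when there are none.
fn unique_module_id(globals: &[Global]) -> String {
    let mut hasher = DefaultHasher::new();
    let mut any_exported = false;
    for global in globals.iter().filter(|g| g.exported) {
        global.name.hash(&mut hasher);
        any_exported = true;
    }
    if any_exported {
        format!(".{:016x}", hasher.finish())
    } else {
        String::new()
    }
}

/// In this model the `___asan_globals_registered` path is taken only when
/// the module id is non-empty.
fn uses_registered_flag(globals: &[Global]) -> bool {
    !unique_module_id(globals).is_empty()
}
```

This is why a test needs at least one exported item (for example a `#[no_mangle]` static) before the flag shows up at all.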

@anforowicz
Contributor

@tmiasko - thanks! I have a working test now :-).

@rnk

rnk commented Aug 9, 2023

When a module doesn't export any symbols, getUniqueModuleId returns an empty string and instrumentation no longer uses ___asan_globals_registered:

I have a patch lying around to try to simplify this logic, but I haven't prioritized it because this issue is still outstanding and I don't want to complicate matters while we look for a solution.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 11, 2023
Mark `___asan_globals_registered` as an exported symbol for LTO

Export a weak symbol defined by AddressSanitizer instrumentation.
Previously, when using LTO, the symbol would get internalized and eliminated.

Fixes rust-lang#113404.

---------------

FWIW, let me list similar PRs from the past + who reviewed them:

* rust-lang#68410 (fixing `__msan_keep_going` and `__msan_track_origins`; reviewed by @varkor)
* rust-lang#60038 and rust-lang#48346 (fixing `__llvm_profile_raw_version` and `__llvm_profile_filename`; reviewed by @alexcrichton)
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 12, 2023
@anforowicz
Contributor

Thanks @tmiasko! I've added only-linux (plus an explanatory comment) to the new test. Feel free to ask bors to commit if this LGTY.

I note that, since we are saying that ___asan_globals_registered is ELF-specific, then maybe we should consider only applying the product code changes to ELF targets. Still, maybe recognizing the ___asan_globals_registered symbol on other targets won't hurt (and it definitely simplifies the code). FWIW, I see that codegen_attrs.rs has some is_like_elf-dependent behavior, but I am not sure if this should also be used in the ___asan_globals_registered PR.

@anforowicz
Contributor

The latest test failure (see #114642 (comment)) on x86_64-apple-1 is:

---- [run-make] tests/run-make/sanitizer-cdylib-link stdout ----
--- stdout -------------------------------
...
  = note: Undefined symbols for architecture x86_64:
            "____asan_globals_registered", referenced from:
               -exported_symbol[s_list] command line option
          ld: symbol(s) not found for architecture x86_64
          clang: error: linker command failed with exit code 1 (use -v to see invocation)
...

This seems to indicate that the product code changes should not unconditionally expect the ___asan_globals_registered symbols whenever ASAN is used, but instead the presence of the symbol should depend on the target architecture. This is a little bit tricky - I see that llvm-project/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp refers to kAsanGlobalsRegisteredFlagName from:

void ModuleAddressSanitizer::InstrumentGlobalsELF(...) {
  ...
  // RegisteredFlag serves two purposes. First, we can pass it to dladdr()
  // to look up the loaded image that contains it. Second, we can store in it
  // whether registration has already occurred, to prevent duplicate
  // registration.
  //
  // Common linkage ensures that there is only one global per shared library.
  GlobalVariable *RegisteredFlag = new GlobalVariable(
      M, IntptrTy, false, GlobalVariable::CommonLinkage,
      ConstantInt::get(IntptrTy, 0), kAsanGlobalsRegisteredFlagName);
  RegisteredFlag->setVisibility(GlobalVariable::HiddenVisibility);
  ...
}
...
void ModuleAddressSanitizer::InstrumentGlobalsMachO(...) {
  ...
  // RegisteredFlag serves two purposes. First, we can pass it to dladdr()
  // to look up the loaded image that contains it. Second, we can store in it
  // whether registration has already occurred, to prevent duplicate
  // registration.
  //
  // common linkage ensures that there is only one global per shared library.
  GlobalVariable *RegisteredFlag = new GlobalVariable(
      M, IntptrTy, false, GlobalVariable::CommonLinkage,
      ConstantInt::get(IntptrTy, 0), kAsanGlobalsRegisteredFlagName);
  RegisteredFlag->setVisibility(GlobalVariable::HiddenVisibility);
  ...
}

So, based on the above, bf9820e#diff-6d46444b86c506127f8dd251a0c949845de78baec01de25b7524b3b10d90efd7 should maybe be modified to only recognize ___asan_globals_registered for ELF and MachO targets.
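The suggestion above, sketched with hypothetical names (`ObjectFormat`, `asan_symbols_to_export`, `linker_name`); none of this is rustc API. The sketch only encodes the proposal; whether it would resolve the macOS failure is not established in this thread. It also assumes the usual Mach-O convention of prefixing symbol names with one underscore, which would account for the four underscores in the linker error.

```rust
/// Hypothetical object-format enum; not a rustc type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ObjectFormat {
    Elf,
    MachO,
    Coff,
    Wasm,
}

const ASAN_REGISTERED_FLAG: &str = "___asan_globals_registered";

/// Lists the ASan symbols to add to the export list, limited to the formats
/// whose instrumentation code creates the flag (per the excerpt above).
fn asan_symbols_to_export(format: ObjectFormat, asan_enabled: bool) -> Vec<&'static str> {
    match (asan_enabled, format) {
        (true, ObjectFormat::Elf) | (true, ObjectFormat::MachO) => vec![ASAN_REGISTERED_FLAG],
        _ => Vec::new(),
    }
}

/// Name as the linker sees it. Assumption: Mach-O prepends one underscore.
fn linker_name(format: ObjectFormat, name: &str) -> String {
    match format {
        ObjectFormat::MachO => format!("_{name}"),
        _ => name.to_string(),
    }
}
```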

@anforowicz
Contributor

I don't know how to detect ELF or MachO targets based on tcx (this would be required if we wanted to pursue the direction outlined at the end of #113404 (comment)). There is tcx.sess.target (Target which also implements a Deref for TargetOptions), but it's unclear to me how to map this into ELF-or-MachO decision.

This hurdle also reinforces the feeling that I've expressed in #113404 (comment): it feels fragile that we have to keep rustc aware of how instrumentation symbols are used in a far-away LLVM land. I would hope that the information about these symbols can be communicated by LLVM (instead of having to duplicate/replicate this information in rustc sources). OTOH, I have very little experience working with linkers, so I don't know if a viable alternative exists.

@tmiasko, @nikic, @rnk, and/or @MaskRay - could you please comment on the above? Do we really want/need rust/compiler/rustc_codegen_ssa/src/back/symbol_export.rs to replicate the knowledge from the (far-away land of) llvm-project/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp that the ___asan_globals_registered symbol exists on ELF and MachO targets and that rustc should handle it with specific SymbolExportInfo?

@anforowicz
Contributor

RE: @nikic: #113404 (comment): Rust assumes that all code provided to (non-plugin) LTO comes from Rust

Would it be possible for rustc to detect symbols injected by LLVM (like ___asan_globals_registered, __msan_keep_going, or __llvm_profile_filename) more generically (without knowing about specific symbol names)? If there was a way to see the list of all symbols (in compiler/rustc_codegen_ssa/src/back/symbol_export.rs? in compiler/rustc_codegen_llvm/src/back/lto.rs?) then maybe these LLVM-origin symbols could be detected by special-casing their prefix (__llvm, __msan, __asan, or ___asan) rather than their exact names?

I tried staring for a while at fn exported_symbols_provider_local and I don't currently see a way to go from its input (i.e. tcx: TyCtxt) to the list of all symbols (including ones injected by LLVM). FWIW, I also don't understand how symbols injected by LLVM (e.g. https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp#L2216) present themselves at the rustc API level (I guess they are not "items" in rustc sense and don't have a DefId).

@rnk

rnk commented Aug 17, 2023

it feels fragile that we have to keep rustc aware of how instrumentation symbols are used in a far-away LLVM land.

With an understanding that we want to work together and find a practical solution, let me first represent the perspective of the instrumentation tool writer. From the PoV of one writing instrumentation passes, it is surprising that Rust assumes that all symbols originate from Rust, and that instrumentation passes cannot add symbols and rely on the standard linkage rules. From the ASan PoV, all we are doing is inventing a common symbol and assuming that linking will proceed as normal, meaning the common symbol will be deduplicated, and there will be one per DSO. I would argue that this Rust pass is breaking fairly reasonable assumptions.

But I'm more interested in practical solutions, so yes, I think using those prefixes would be a good place to start. I'm not sure that list is complete, however.

  • PGO, for example, creates lots of symbols using a __prof[dc] prefix. I see the check for __llvm_profile_raw_version already exists in symbol_export.rs
  • I could imagine that sanitizers may start moving some shared sanitizer functionality from the __[amt]san_* prefix space to something like __sanitizer_common_*, and there's some fragility there, but that's hypothetical.
  • There is also the sanitizer coverage mode, which uses __sancov_*.
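A prefix-based check along these lines might look like the sketch below. The constant and function names are hypothetical, and the prefix list is just the one collected from the comments above (`__[amt]san` and `__prof[dc]` expanded by hand); as noted, it may well be incomplete.

```rust
/// Prefixes mentioned in this thread for symbols that LLVM instrumentation
/// passes inject. Illustrative only; not claimed to be exhaustive.
const INSTRUMENTATION_PREFIXES: &[&str] = &[
    "__llvm",
    "__asan",
    "___asan",
    "__msan",
    "__tsan",
    "__profc",
    "__profd",
    "__sancov_",
];

/// Returns true if `name` starts with one of the known prefixes.
fn looks_llvm_injected(name: &str) -> bool {
    INSTRUMENTATION_PREFIXES
        .iter()
        .any(|prefix| name.starts_with(*prefix))
}
```

A prefix match is cheap, but it also preserves any user symbol that happens to start the same way, so it trades a little precision for not having to track exact names.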

We could go the direction of having LLVM instrumentation passes annotate symbols somehow, but it would be difficult to enforce that as a requirement for instrumentation passes. Anyone writing a new instrumentation pass is likely to discover the annotation requirement when they first try to instrument Rust code, which doesn't reduce the cost of this issue; it just distributes the fixes from symbol_export.rs to each instrumentation pass.

You may be able to put together some heuristic by looking at global variables with weak linkages (common, linkonce_odr, weak, weak_odr, etc). If Rust never or rarely produces weak symbols, you could teach the exporter to leave those alone. That would be the simplest.
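The weak-linkage heuristic could be sketched as follows. `Linkage` here is a hypothetical, trimmed-down enum rather than LLVM's or rustc's linkage type, and `may_internalize` is an illustrative name.

```rust
/// Trimmed-down, hypothetical set of linkage kinds.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Linkage {
    External,
    Internal,
    Common,
    LinkOnceOdr,
    Weak,
    WeakOdr,
}

/// Linkages whose definitions the linker is expected to deduplicate.
fn is_weak_like(linkage: Linkage) -> bool {
    matches!(
        linkage,
        Linkage::Common | Linkage::LinkOnceOdr | Linkage::Weak | Linkage::WeakOdr
    )
}

/// Heuristic: a symbol may be internalized during LTO only if it is neither
/// explicitly exported nor weak-like.
fn may_internalize(linkage: Linkage, explicitly_exported: bool) -> bool {
    !explicitly_exported && !is_weak_like(linkage)
}
```

Under this rule a `common` global such as `___asan_globals_registered` is left alone without the exporter knowing its name.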

@bjorn3
Member

bjorn3 commented Aug 17, 2023

I would argue that this Rust pass is breaking fairly reasonable assumptions.

For the staticlib and cdylib crate types we need to hide all symbols except those marked with #[no_mangle] to prevent conflicts between multiple copies of rust crates (like the standard library itself) that are statically linked into them.[1] We are using the same list of symbols to be exported for LTO as these symbols definitively need to be exported and not exporting the rest as far as LTO is concerned will allow for the most optimizations. The staticlib and cdylib crate types use a closed world model: All rust code is contained in them and only an explicitly specified C interface is exported. #114642 simply extends the C interface it exports.

Footnotes

  1. For staticlibs we currently don't do this, as there seems to be no way to do it without editing the object files themselves. (Hidden visibility wouldn't work, as applying hidden visibility to the standard library would break dynamic linking.) This currently makes it impossible to link two rust staticlibs into the same executable except when using a linker which allows multiple identical definitions of a symbol, and even then multiple versions of a crate would break.

@anforowicz
Contributor

AFAICT the initial fix attempt at bf9820e influences whether prepare_lto from rust/compiler/rustc_codegen_llvm/src/back/lto.rs contains ___asan_globals_registered in the symbols_below_threshold that get returned as the first tuple item. symbols_below_threshold are then passed to LLVMRustRunRestrictionPass in rust/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp and used in the PreserveFunction predicate.

So, let me try a different fix approach and recognize ASAN/MSAN/etc-related symbols in the PreserveFunction predicate (rather than in exported_symbols_provider_local as in the original fix attempt). See: #114946
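As a rough illustration of the idea (not the actual code in #114946, which lives in C++ in PassWrapper.cpp), the predicate could look something like the Rust sketch below. The function names and the rule of ignoring leading underscores before matching `asan_`/`msan_` are assumptions made for this example.

```rust
/// Assumed rule for illustration: a symbol counts as sanitizer-related if,
/// after dropping leading underscores, it starts with `asan_` or `msan_`.
/// This covers both `___asan_globals_registered` and `__msan_track_origins`.
pub fn is_sanitizer_symbol(name: &str) -> bool {
    let trimmed = name.trim_start_matches('_');
    trimmed.starts_with("asan_") || trimmed.starts_with("msan_")
}

/// Hypothetical stand-in for the preserve predicate: keep a symbol if it is
/// in the list computed for LTO, or if it is a sanitizer-related symbol.
pub fn should_preserve(name: &str, symbols_below_threshold: &[&str]) -> bool {
    symbols_below_threshold.contains(&name) || is_sanitizer_symbol(name)
}
```

The point of the sketch is only that the name-based check sits next to the existing list lookup, so the list of exported symbols itself does not need to know about sanitizers.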

@anforowicz
Contributor

I guess one could argue about pushing the knowledge about __asan, __msan, etc. prefixes even further down - e.g. into the implementation of InternalizePass::internalizeModule or InternalizePass::shouldPreserveGV in LLVM. @rnk and/or @tmiasko - any strong opinions on continuing with the rustc PR at #114946 VS authoring an LLVM PR instead?

@rnk

rnk commented Aug 17, 2023

Putting this list of symbols in LLVM sounds practical to me. Is it possible to emulate what Rust is doing using opt -internalize? If so, we could make a small regression test out of this, and put it in the asan test suite.

@anforowicz
Contributor

anforowicz commented Aug 17, 2023

Hmmm... I now think that it still makes sense to review and attempt to land the rustc PR at https://github.com/rust-lang/rust/pull/114946 even if we (eventually) preserve these ASAN symbols via a separate LLVM PR:

  • At least part of the rustc PR #114946 ("Preserve ASAN-related symbols during LTO") is desirable in the long term (i.e. even once LLVM changes are made). Specifically, we want to land the changes that make symbol_export.rs unaware of the __llvm_profile_raw_version, __llvm_profile_filename, __msan_keep_going, and __msan_track_origins names (i.e. keep the removal of code in 2c75640#diff-6d46444b86c506127f8dd251a0c949845de78baec01de25b7524b3b10d90efd7).
  • It seems desirable for rustc to work correctly even if built with older LLVM versions
  • We can consider opening a follow-up issue against the llvm-project (and maybe adding a TODO in the rustc PR?)

RE: @rnk: #113404 (comment): Is it possible to emulate what Rust is doing using opt -internalize?

Maybe. I don't know how to check. Sorry.

FWIW, I see that LLVM-level tests that use opt -internalize take LLVM-IR as input. This probably means that ___asan_globals_registered would have to be hardcoded/simulated in the test input (rather than generated by the real ASAN). Maybe this is ok.


RE: @rnk: #113404 (comment): Putting this list of symbols in LLVM sounds practical to me.

The llvm-project/llvm/lib/Transforms/IPO/Internalize.cpp source file has quite elaborate code for preserving some symbols. I wonder if ASAN / MSAN / etc symbols should be protected by any of the existing mechanisms:

  • Would we consider adding ___asan_globals_registered, __msan_track_origins, etc. into InternalizePass::AlwaysPreserved (e.g. somewhere here)?
  • Or maybe the already-existing preservation of Used variables should kick in (but for some reason doesn't work for ASAN/MSAN/etc.)?
  • Or maybe the already-existing checks in InternalizePass::shouldPreserveGV should kick in (but for some reason don't work for ASAN/MSAN/etc.). Or maybe these checks need to be extended in a generic way (rather than teaching Internalize.cpp about __asan and/or __msan prefixes).

@anforowicz
Contributor

anforowicz commented Aug 25, 2023

Status update / summary (I edited this comment on 2023-08-25 at 10:47 PST to add one other potential next step at the very end of the comment):

Can we discuss the next steps?:

@anforowicz
Contributor

@bjorn3, in #114946 (comment) you've suggested that there might be an issue with Option 2 of the fix. Let me partially reply here, because I have some questions that are not related to Option 2.

Do you think we should proceed with Option 1 (or is there another approach that you'd suggest)? If so, then I assume that you agree that to avoid the x86_64-apple-1 test failures we should only inject __asan_globals_registered in exported_symbols_provider_local when the symbol is actually present (i.e. when ModuleAddressSanitizer::InstrumentGlobalsELF and/or ModuleAddressSanitizer::InstrumentGlobalsMachO inject the symbol). Could you please help me understand how to tweak #114642 to do this? How can exported_symbols_provider_local detect ELF and/or MachO targets? Do you think it would be okay to copy-and-paste let is_like_elf = ... from codegen_attrs.rs? What would you suggest for detecting MachO targets? (I don't see code for detecting MachO targets outside of compiler/rustc_codegen_cranelift/src/lib.rs, which I assume can't be used outside of cranelift.)

@bjorn3
Member

bjorn3 commented Aug 26, 2023

Do you think we should proceed with Option 1

I think so, but I may be misunderstanding what exactly address sanitizer needs in terms of linkage.

How can exported_symbols_provider_local detect ELF and/or MachO targets?

Like this:

    let binary_format = if sess.target.is_like_osx {
        BinaryFormat::MachO
    } else if sess.target.is_like_windows {
        BinaryFormat::Coff
    } else if sess.target.is_like_aix {
        BinaryFormat::Xcoff
    } else {
        BinaryFormat::Elf
    };

Maybe extracting this into a function to ensure it is kept in sync makes sense.
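A self-contained sketch of what such an extracted helper might look like is below. `TargetFlags` is a hypothetical stand-in for the relevant fields of `sess.target`, the local `BinaryFormat` enum stands in for the real one, and `may_have_asan_globals_registered` is an invented name; the assumption that only ELF and MachO targets get the symbol comes from the question above, not from checking LLVM.

```rust
/// Local stand-in for the real `BinaryFormat` enum (illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BinaryFormat {
    Elf,
    MachO,
    Coff,
    Xcoff,
}

/// Hypothetical stand-in for the target options consulted in the snippet above.
#[derive(Clone, Copy, Debug, Default)]
pub struct TargetFlags {
    pub is_like_osx: bool,
    pub is_like_windows: bool,
    pub is_like_aix: bool,
}

/// Same decision chain as the snippet above, extracted into one function
/// so that every caller stays in sync. ELF is the fallback.
pub fn binary_format(target: &TargetFlags) -> BinaryFormat {
    if target.is_like_osx {
        BinaryFormat::MachO
    } else if target.is_like_windows {
        BinaryFormat::Coff
    } else if target.is_like_aix {
        BinaryFormat::Xcoff
    } else {
        BinaryFormat::Elf
    }
}

/// Assumption from the discussion: the ASAN "globals registered" symbol is
/// only injected on ELF and MachO targets, so only export it there.
pub fn may_have_asan_globals_registered(target: &TargetFlags) -> bool {
    matches!(
        binary_format(target),
        BinaryFormat::Elf | BinaryFormat::MachO
    )
}
```

A caller such as exported_symbols_provider_local could then gate the extra symbol on the second function instead of repeating the if/else chain.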

anforowicz added a commit to anforowicz/rust that referenced this issue Aug 28, 2023
@anforowicz
Contributor

How can exported_symbols_provider_local detect ELF and/or MachO targets?

Like this:

...

Maybe extracting this into a function to ensure it is kept in sync makes sense.

Thanks! That helps :-). Not sure how I managed to miss this piece of code when grepping the source code for BinaryFormat... :-/

FWIW I've applied the changes suggested above to the newly pushed version of #114642 (e.g. see d9abc46#diff-aa810a3be0834da891b171a8b04b09d0d3bbb76ac27c757cd232235099e062d2)

bors added a commit to rust-lang-ci/rust that referenced this issue Sep 6, 2023
… r=tmiasko

Preserve ASAN-related symbols during LTO.

Fixes rust-lang#113404
@bors bors closed this as completed in e6dddbd Sep 6, 2023
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Sep 12, 2023