Make rustc use jemalloc through #[global_allocator] #51038

Closed
SimonSapin opened this issue May 24, 2018 · 16 comments
Labels
A-allocators Area: Custom and system allocators C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@SimonSapin
Contributor

Once the #[global_allocator] attribute and the GlobalAlloc trait are stable (#49668), we plan to make the system allocator the default for executables instead of jemalloc, and remove jemalloc from the standard library: #36963 / #27389.

Presumably, we want rustc to keep using jemalloc, since it often performs better for rustc’s typical workload. Like other programs, it can do so using #[global_allocator] and the jemallocator crate or similar. (We would also need to make the jemalloc symbols unprefixed so that they get picked up by LLVM too, but I don’t foresee any difficulty adding a Cargo feature to jemallocator for this.)
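
For readers unfamiliar with the mechanism, here is a minimal sketch of how a program opts into a custom allocator with #[global_allocator]. To keep it free of external crates, a hypothetical `Counting` type stands in for `jemallocator::Jemalloc`: it forwards to the system allocator and counts calls. The names `Counting`, `ALLOC_CALLS` and `allocations_for_vec` are assumptions made for illustration and appear nowhere in rustc or jemallocator.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for `jemallocator::Jemalloc`: it forwards every
/// request to the system allocator and counts calls to `alloc`.
pub struct Counting;

/// Number of `alloc` calls observed so far, across all threads.
pub static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        // Safety: the caller upholds the `GlobalAlloc::alloc` contract.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // Safety: `ptr` was returned by `System.alloc` with this layout.
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every heap allocation in the final program is routed through this static.
#[global_allocator]
static GLOBAL: Counting = Counting;

/// Returns how many `alloc` calls were observed while creating one `Vec`.
pub fn allocations_for_vec() -> usize {
    let before = ALLOC_CALLS.load(Ordering::Relaxed);
    let v: Vec<u64> = Vec::with_capacity(16);
    // Keep the optimizer from eliding the otherwise unused allocation.
    black_box(&v);
    let after = ALLOC_CALLS.load(Ordering::Relaxed);
    drop(v);
    after - before
}
```

Calling `allocations_for_vec()` from `main` should report at least one allocation, which shows that `Vec` went through the registered static. With the jemallocator crate, the static would presumably be declared with `jemallocator::Jemalloc` instead of `Counting`.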

However, per #45966 (comment) this might not work because rustc itself is compiled with -C prefer-dynamic and links to the standard library dynamically.

@alexcrichton, are -C prefer-dynamic and #[global_allocator] fundamentally incompatible or is this something we can fix? Is it possible to make rustc not use -C prefer-dynamic?

CC @gnzlbg, @rust-lang/compiler

@SimonSapin SimonSapin added A-allocators Area: Custom and system allocators T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 24, 2018
@gnzlbg
Contributor

gnzlbg commented May 24, 2018

This is not how it is implemented in Rust AFAIK, but my model coming from C++ would be that:

  • There is one global extern function per method in the GlobalAlloc trait
  • Everything uses the global allocator through these functions and is compiled against them (the std lib, rustc, ...)
  • A crate containing a #[global_allocator] contains these functions (e.g. generated by the compiler) which just call the trait methods of the global allocator in that crate

If there is a #[global_allocator] in the dependency graph, we can just assume that the user wanted to link an allocator statically.

If there is no #[global_allocator] present in the dependency graph, it depends on how the user wants to link the allocator:

  • statically: we add the system allocator crate (or some other crate) to the dependency graph, such that the required extern functions are present. LTO should allow inlining through the extern functions in this case.
  • dynamically: we do nothing. A dynamic library with those symbols present must be linked at run-time.

And that's it? If the functions are present in the binary, they can be optimized through LTO, and if they aren't, the binary must be dynamically linked to something that has them or it won't run.


This adds a couple of constraints on the GlobalAlloc trait, like it cannot provide generic methods, because those cannot be dynamically linked, but I think that's alright.
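
A rough sketch of that model, under stated assumptions: the crate that owns the allocator exposes one non-generic free function per trait method, and each function just forwards to the chosen `GlobalAlloc` implementation. The names `sketch_alloc`, `sketch_dealloc`, `CHOSEN` and `roundtrip` are hypothetical and are deliberately different from the real `__rust_alloc` family. A real export would also carry an unmangled-name attribute so that other crates could declare the functions in an `extern` block; that attribute is left out here so the sketch compiles unchanged across editions.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::mem::{align_of, size_of};

// The allocator selected by the crate that owns the symbols. Here it is the
// system allocator; a crate with #[global_allocator] would use its own type.
static CHOSEN: System = System;

/// Hypothetical shim for `GlobalAlloc::alloc`. It takes plain integers rather
/// than generic parameters, so it could be resolved by a dynamic linker.
pub unsafe extern "C" fn sketch_alloc(size: usize, align: usize) -> *mut u8 {
    match Layout::from_size_align(size, align) {
        // Safety: the layout was validated by `from_size_align`.
        Ok(layout) => unsafe { CHOSEN.alloc(layout) },
        Err(_) => std::ptr::null_mut(),
    }
}

/// Hypothetical shim for `GlobalAlloc::dealloc`.
pub unsafe extern "C" fn sketch_dealloc(ptr: *mut u8, size: usize, align: usize) {
    if let Ok(layout) = Layout::from_size_align(size, align) {
        // Safety: the caller passes a pointer obtained from `sketch_alloc`
        // with the same size and alignment.
        unsafe { CHOSEN.dealloc(ptr, layout) }
    }
}

/// Client code: allocates one `u64` through the shims, writes a value,
/// reads it back, and frees the memory.
pub fn roundtrip(value: u64) -> u64 {
    let (size, align) = (size_of::<u64>(), align_of::<u64>());
    unsafe {
        let p = sketch_alloc(size, align) as *mut u64;
        assert!(!p.is_null(), "allocation failed");
        p.write(value);
        let out = p.read();
        sketch_dealloc(p as *mut u8, size, align);
        out
    }
}
```

Client code such as `roundtrip` only ever sees the two free functions, never the allocator type behind them. That is why swapping the allocator comes down to which definition of those symbols ends up linked, statically or at run time.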

@sfackler
Member

> This is not how it is implemented in Rust AFAIK

That's exactly how it's implemented in Rust: https://github.com/rust-lang/rust/blob/master/src/liballoc/alloc.rs#L25-L38

@alexcrichton
Member

The technical problem here is, yes, libstd is built as a dynamic library for the compiler. This dynamic library, especially on Windows, needs to have all symbols resolved. This means that when we build libstd itself we must choose an allocator. Currently we aren't set up to do two separate builds of libstd, one for an rlib and one for a dylib.

So it's a hard technical requirement that the first dylib we compile must have all symbols resolved, and currently that means that it must select an allocator. We may be able to get away without compiling libstd as a dylib and just dealing with rustc, but that runs a high risk of being a breaking change.

A possible solution is basically just hacking around everything here in rustc... somehow. Like basically adding hardcoded logic that libstd's dylib links to jemalloc while the rlib explicitly doesn't link to jemalloc (or something like that)

@SimonSapin
Contributor Author

Does this mean that #[global_allocator] is fundamentally incompatible with dynamic linking?

@retep998
Member

Well on Windows if the global allocator is in a .dll then you can just swap it out for a different .dll with the same ABI and same name but a different global allocator.

@alexcrichton
Member

Historically the system allocator and jemalloc had different sets of symbols associated with them, and we chose "at the last minute" which one to route the main allocation symbols to. Nowadays though, with #[global_allocator], I don't think this is necessary any more, and #[global_allocator] can probably generate the symbols that liballoc expects.

In that sense I think this will be possible by building libstd.dylib with a dynamic dependency on these symbols. Ideally we'd do something like include the symbols in libstd.dylib but force all function calls to go through the dynamic linker still. That way we could still load jemalloc but it wouldn't be required. I'm not sure how plausible that is though for all platforms.

bors added a commit that referenced this issue Jul 3, 2018
[Do not merge yet] Remove alloc_jemalloc, switch the default global allocator to System

Fixes #36963

**Do not merge** yet. This PR by itself is likely to regress rustc performance. The purpose of opening it now is to measure by how much. We’ll likely want to figure out #51038 and land them around the same time.
@SimonSapin
Contributor Author

It just occurred to me that rustc is already not using jemalloc on Windows, so only Unix-like platforms are relevant to this issue.

@SimonSapin
Contributor Author

I tried the "obvious" patch (on top of #52020):

diff --git a/src/librustc_driver/lib.rs b/src/librustc_driver/lib.rs
index 84f7b35d21..c7e9fc77ce 100644
--- a/src/librustc_driver/lib.rs
+++ b/src/librustc_driver/lib.rs
@@ -33,6 +33,8 @@ extern crate arena;
 extern crate getopts;
 extern crate graphviz;
 extern crate env_logger;
+#[cfg(not(windows))]
+extern crate jemallocator;
 #[cfg(unix)]
 extern crate libc;
 extern crate rustc_rayon as rayon;
@@ -118,6 +120,11 @@ pub mod driver;
 pub mod pretty;
 mod derive_registrar;
 
+#[cfg(not(windows))]
+#[cfg(not(stage0))]
+#[global_allocator]
+static A: jemallocator::Jemalloc = jemallocator::Jemalloc;
+
 pub mod target_features {
     use syntax::ast;
     use syntax::symbol::Symbol;

It seems to work perfectly on Linux. The executable and every .so file all have their own __rust_alloc symbol. Most of them call __rdl_alloc (system, the default); the ones in the executable and in librustc_driver*.so call __rg_alloc (jemallocator, through #[global_allocator]). I assume that the symbol in the executable "wins" somehow, since when running that rustc in gdb and breaking on __rust_alloc I always end up in code that then calls __rg_alloc.

It doesn’t work at all on macOS. The symbols present in various files look similar, but when running under lldb most calls to __rust_alloc seem to go to a version that calls __rdl_alloc. Only allocations made within the rustc_driver crate seem to be correctly routed through the #[global_allocator] attribute and __rg_alloc.

Based on these observations, I supposed that current versions of rustc might have the same problem, because the symbol and linking setup is pretty much the same. Maybe rustc on macOS doesn’t actually use liballoc_jemalloc? But somehow that’s not the case, and __rust_alloc does go to __rde_alloc and jemalloc. I don’t understand why my branch is different.

@glandium
Contributor

glandium commented Jul 7, 2018

jemalloc itself hooks up the system allocator on osx, so even if rustc ends up using malloc, it ends up using jemalloc.

@alexcrichton
Member

@SimonSapin hm while it may work on Linux all dynamic libraries having __rust_alloc symbols sounds somewhat scary to me in that it's a nasty bug in the making for down the road. It'd be disastrous, for example, for some dynamic libraries to accidentally use malloc where others use jemalloc on Linux.

I like @glandium's idea of just having jemalloc linked in on OSX to implement the malloc/free symbols with jemalloc. For our tier 1 platforms that just leaves us figuring out Linux as we're not using jemalloc on Windows.

For Linux I'm not really sure what the best option is here. I'd love to get to a point where we can simplify how the allocator symbols work out (avoid redirecting shims) and perhaps just explicitly leverage the dynamic linker shenanigans to get the job done. We just need to be careful here because it can in theory affect stable programs compiled against the libstd dynamic library, but I can't imagine there are many of those in existence...

@SimonSapin
Contributor Author

> while it may work on Linux all dynamic libraries having __rust_alloc symbols sounds somewhat scary to me in that it's a nasty bug in the making for down the road.

I agree, but isn’t this setup already the same today?

@alexcrichton
Member

@SimonSapin ah indeed true! I think this is a bug though right now in that __rust_alloc should be a private dll-local symbol instead of an exported symbol (as it is today). Additionally all __rust_alloc definitions in all the dynamic libraries are the same, rather than having one different one that happens to trump the other ones.

@alexcrichton
Member

@SimonSapin oh so here's an idea: One possibility would be to create something like a rustc_std crate in the distribution. This crate would be a dynamic library but would link everything statically (including libstd). Today libstd is the "base dynamic library" but for the compiler this would be the base dynamic library. All rustc dylibs would link to libstd through this library.

That way libstd's dll would default to the system allocator while librustc's "libstd" would link to jemalloc. I think that'd do the trick? That way we don't have to worry about duplicate symbols and such.

@SimonSapin
Contributor Author

How would proc-macro crates fit into this? They’re compiled as dynamic libraries loaded in the same process as rustc, right?

@alexcrichton
Member

Indeed yeah, but they currently link to libsyntax which is what defines the allocator, and in the future we can just make sure that the proc-macro crate type uses the same allocator as rustc (and/or the same set of runtime libraries)

@XAMPPRocky XAMPPRocky added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Oct 2, 2018
@alexcrichton
Member

I'm gonna try to consolidate allocator and jemalloc/rustc related issues into #36963 as I think this is all basically enabled by one PR which would solve that issue. I'll be updating the OP there soon too


7 participants