Remove useless #[global_allocator]
from rustc and rustdoc.
#92222
This was added in rust-lang#83152, which has several errors in its comments. This commit also fixes up the comments, which are quite wrong and misleading.
The knowledge embodied in this PR was hard-earned: I spent more than a day trying to do some jemalloc experiments, horribly confused as to why the tikv-jemallocator crate's impl of I have confirmed on Linux that jemalloc is still being used (when
BTW, this PR touches on the reasons why rustc does not and cannot use jemalloc via
@bors: r+ rollup=never
I figure it's good to keep this out of rollups to ensure it's got a separate perf measurement, but I wouldn't really expect any measurement changes as a result of this.
I was gonna write up how I think we could hook into sized deallocation but turns out I already did that, yay!
// A note about jemalloc: rustc uses jemalloc when built for CI and
// distribution. The obvious way to do this is with the `#[global_allocator]`
// mechanism. However, for complicated reasons (see
// https://github.com/rust-lang/rust/pull/81782#issuecomment-784438001 for some
I thought I had written up a comment on some of this awhile back and lo-and-behold, thanks for linking this here!
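For readers unfamiliar with the mechanism under discussion, here is a minimal sketch of `#[global_allocator]` in an ordinary Rust binary, which is the "obvious way" the code comment refers to. It is not taken from rustc. The names `CountingAlloc`, `ALLOC_CALLS`, and `allocations_made_by` are hypothetical, and the wrapper forwards to `std::alloc::System` rather than jemalloc so that it needs no external crates.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper allocator: counts allocation calls and forwards
/// every request to the system allocator.
pub struct CountingAlloc;

/// Number of `alloc` calls seen so far (hypothetical name).
pub static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Registers the wrapper as the allocator behind Box, Vec, String, etc.
// for the crate graph this is linked into.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and reports how many `alloc` calls happened while it ran.
/// Other threads may also allocate, so treat the result as a lower bound
/// check rather than an exact count.
pub fn allocations_made_by<F: FnOnce()>(f: F) -> usize {
    let before = ALLOC_CALLS.load(Ordering::Relaxed);
    f();
    ALLOC_CALLS.load(Ordering::Relaxed) - before
}
```

Calling `allocations_made_by(|| { std::hint::black_box(Box::new(1u64)); })` from `main` should report at least one allocation, which is a quick way to confirm that the registered allocator is actually being called.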
@bors: r+ rollup=never
📌 Commit bb23bfc has been approved by
Thank you for the fast review.
Unfortunately I'm stuck on step 1: "First probably measure the impact to see whether it's at all worth it." Hence my question above.
Oh dear sorry I didn't read close enough! I think you may actually be able to hack around things with Linux symbol resolution in dynamic libraries. For example this program:

use std::ptr;

fn main() {
    let _x = Box::new(1);
}

#[no_mangle]
pub unsafe extern "C" fn __rust_alloc(size: usize, align: usize) -> *mut u8 {
    eprintln!("hello");
    ptr::null_mut()
}

yields:
so I think if you define these symbols in the rustc executable it might work? Your implementation would then forward to Note that this certainly won't work for Windows, may or may not work for macOS (unsure about the dynamic symbol resolution rules there), and is kind of a weird hack for Linux. I think this might work for experimenting but I wouldn't necessarily recommend landing it!
☀️ Test successful - checks-actions
Finished benchmarking commit (d6d12b6): comparison url. Summary: This benchmark run did not return any relevant changes. If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression
Your suggestion worked! Thank you. For posterity, here is my diff:
I used this to measure the effect of sized deallocations. It ended up being a slowdown. Here's the top part of the instruction counts table:
There were lots more entries in the 0.3-1.8% range after that. Here is the top of the cachegrind diff (lightly edited) for a
It's clear from this that the hack worked -- you can see negative counts for The main source of additional instructions is the
The other significant source of additional instructions is the 47,350,395 from
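As background for the sized deallocation experiment, the sketch below shows why it is possible at all on the Rust side: `GlobalAlloc::dealloc` is handed the same `Layout` that was used for the allocation, so the size is available to pass on to a sized-free entry point. The type `SizedFreeTracker` and the function `free_one_block` are hypothetical names for illustration. The code only records the size and forwards to `std::alloc::System`; it does not call jemalloc.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator that records how many bytes callers report
/// when they free memory.
pub struct SizedFreeTracker {
    pub bytes_freed: AtomicUsize,
}

unsafe impl GlobalAlloc for SizedFreeTracker {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // The caller supplies the size here. An allocator with a sized-free
        // API could be given `layout.size()` at this point.
        self.bytes_freed.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.dealloc(ptr, layout) }
    }
}

/// Allocates and frees one block of `size` bytes (which must be non-zero)
/// through the tracker, then returns the number of bytes it saw freed.
pub fn free_one_block(size: usize) -> usize {
    let tracker = SizedFreeTracker { bytes_freed: AtomicUsize::new(0) };
    let layout = Layout::from_size_align(size, 8).expect("valid layout");
    unsafe {
        let p = tracker.alloc(layout);
        assert!(!p.is_null(), "allocation failed");
        tracker.dealloc(p, layout);
    }
    tracker.bytes_freed.load(Ordering::Relaxed)
}
```

Here the tracker is called directly instead of being registered with `#[global_allocator]`, which keeps the example independent of how the final binary is linked.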
Nice! I'm glad the test was done first before the design in this case... A bit of a surprising result, though. Even if the
So surprising in fact, that perhaps we should tell the jemalloc devs over at https://github.com/jemalloc/jemalloc to investigate further...
I double checked this in #92548 and got very similar results on CI to what I measured locally, which was a good sanity check.
@davidtgoldblatt @interwq
Thanks for letting us know @LifeIsStrange. A few factors might explain the slowdown:
cc: @Lapenkov
@interwq As far as I know none of the things you suggested are happening. rustc is fairly allocation heavy. Like any compiler, it has lots of heterogeneous tree structures. The second half of this comment has some analysis of the sized deallocation change. Much of the slowdown is caused by an additional layer of Rust code in tikv-jemallocator being run in front of each allocation and deallocation, which isn't jemalloc's fault and isn't that surprising. What is more surprising is that
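To make the "additional layer of Rust code" concrete, here is a sketch of the kind of per-call work a wrapper crate might do before reaching the C allocator: turning a Rust `Layout` into an alignment flag. This is an illustration under stated assumptions, not the crate's actual code. `flags_for` and `ASSUMED_MIN_ALIGN` are hypothetical names, and the value 16 is an assumed platform minimum. The one detail taken from jemalloc itself is that its `MALLOCX_ALIGN` macro encodes an alignment as its base-2 logarithm.

```rust
use std::alloc::Layout;

/// Assumed alignment that the underlying allocator guarantees without being
/// asked. The real value is platform-specific; 16 is only a placeholder.
const ASSUMED_MIN_ALIGN: usize = 16;

/// Hypothetical shim run in front of every allocation and deallocation:
/// returns 0 when the default alignment is sufficient, otherwise the
/// base-2 logarithm of the requested alignment.
pub fn flags_for(layout: Layout) -> i32 {
    if layout.align() <= ASSUMED_MIN_ALIGN && layout.align() <= layout.size() {
        0
    } else {
        // `Layout` guarantees a power-of-two alignment, so the number of
        // trailing zero bits is exactly log2(align).
        layout.align().trailing_zeros() as i32
    }
}
```

Each call is only a few instructions, but when it runs on every allocation and deallocation in an allocation-heavy program, those instructions show up in instruction-count profiles.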
Agree that the sdallocx not being faster part is unexpected. Can you grab a
@interwq: I have attached the Cachegrind annotations of Both profiles show identical paths through Does that make sense? Is setting
@nnethercote: thanks for the info! Yes that explains it -- sorry that I forgot in the last release CACHE_OBLIVIOUS also plays a role here. That requirement got optimized away in jemalloc/jemalloc#1749. If you'd like, feel free to try the current jemalloc dev branch: jemalloc/jemalloc@f509703. In the meantime we are preparing the upcoming 5.3 release, which will be based on the commit above. re:
I tried the current jemalloc dev branch. Compared to 5.2 it shows a small but clear reduction in instruction counts for rustc, up to 1.44%. Good to know for the future. I also tried doing the sized deallocation thing. Results were better than before but still a net slowdown, mostly because of the cost of the
@nnethercote @guswynn Jemalloc 5.3.0 has been officially released as stable! https://github.com/jemalloc/jemalloc/releases/tag/5.3.0
Note:
Thanks, @lqd!
r? @alexcrichton