Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get LLVM using jemalloc and inline FNV hashing #31460

Merged
merged 3 commits into from
Feb 14, 2016

Conversation

alexcrichton
Copy link
Member

Looking at some profiles of rustc recently, these seemed like two pretty easy wins we could get in terms of performance on the table.

@dotdash
Copy link
Contributor

dotdash commented Feb 7, 2016

Do you have any numbers for this? Also, I seem to recall a huge discussion the last time the prefixes were to be dropped. Has that been resolved or is this a different change?

@bluss
Copy link
Member

bluss commented Feb 7, 2016

We should put in a release note if the jemalloc prefix changes. If anyone is experimenting with JE_MALLOC_CONF and so on, it's useful to know it has changed.

@alexcrichton
Copy link
Member Author

@dotdash ah yeah sorry I forgot to copy them over into the PR description, but the two commits have:

jemalloc un-prefixing:

Locally this took compile time for libsyntax from 95 seconds to 89 (a 6% improvement).

fnv inline:

This commit made the hashing function disappear from a profiled version of the compiler, but that's likely because it was just inlined elsewhere. When compiling winapi, however, this decreased compile time from 18.3 to 17.8 seconds (a 3% improvement).

@bluss good point!

@alexcrichton alexcrichton added the relnotes Marks issues that should be documented in the release notes of the next release. label Feb 7, 2016
@dotdash
Copy link
Contributor

dotdash commented Feb 7, 2016

@alexcrichton Thanks! You missed the prefix issue though. The old PR was #18678 which, besides the long discussion that led to it not being merged, also had it so that on iOS/OSX and Windows, the prefix is still present (which is the default for jemalloc).

@alexcrichton
Copy link
Member Author

Aha yes, thanks for finding a link to that! I should have hunted that down and thrown that in here.

When that discussion happened we did not have the same support for swapping out allocators that we have today. The RFC for custom allocators was merged long after that discussion, and the early conclusions also pointed towards using jemalloc unprefixed in executables. The major difference between now and #18678 is that we only use jemalloc in executables, no static library or dynamic library in Rust uses jemalloc by default (they use the system allocator).

As a result, due to the change in landscape, many of the concerns raised in #18678 have since been solved. One of the major motivational factors for RFC 1183 was indeed to land this change, I just forgot to get around to it until now!

@alexcrichton
Copy link
Member Author

r? @nikomatsakis

@brson
Copy link
Contributor

brson commented Feb 9, 2016

@bors r+

@bors
Copy link
Contributor

bors commented Feb 9, 2016

📌 Commit 0e5c112 has been approved by brson

@bors
Copy link
Contributor

bors commented Feb 10, 2016

⌛ Testing commit 0e5c112 with merge 9b24f54...

@bors
Copy link
Contributor

bors commented Feb 10, 2016

💔 Test failed - auto-mac-64-opt

@alexcrichton alexcrichton deleted the supafast-rustc branch February 10, 2016 17:30
@alexcrichton alexcrichton restored the supafast-rustc branch February 10, 2016 17:33
@alexcrichton alexcrichton reopened this Feb 10, 2016
@alexcrichton
Copy link
Member Author

Oops!

@dotdash
Copy link
Contributor

dotdash commented Feb 10, 2016

@bors r-

Seems that bors was a bit too happy to see this restored

@alexcrichton
Copy link
Member Author

@bors: r=brson 51e0ee8

ok, let's try that again

@bors
Copy link
Contributor

bors commented Feb 11, 2016

⌛ Testing commit 51e0ee8 with merge 59bff40...

@bors
Copy link
Contributor

bors commented Feb 11, 2016

💔 Test failed - auto-linux-musl-64-opt

@alexcrichton
Copy link
Member Author

Looks like the failure on musl was legitimate, and it happened because:

  1. Both liballoc_jemalloc and liblibc defined a function called malloc
  2. When creating a binary with LTO, both rlibs were linked with -Wl,--whole-archive

This meant the linker found itself trying to include the same symbol twice, and rightly exited with an error.

I believe that the compiler here is erroneous in passing -Wl,--whole-archive when we're linking an executable (that was just intended for a dylib), so I've added a commit to not do that (with a test added as well).

@alexcrichton
Copy link
Member Author

ah and re-r? @brson

@rust-highfive rust-highfive assigned brson and unassigned nikomatsakis Feb 11, 2016
@brson
Copy link
Contributor

brson commented Feb 11, 2016

@bors r+

@bors
Copy link
Contributor

bors commented Feb 11, 2016

📌 Commit e93b127 has been approved by brson

@alexcrichton
Copy link
Member Author

@bors: r=brson 00bf071

@bors
Copy link
Contributor

bors commented Feb 12, 2016

⌛ Testing commit 00bf071 with merge 5978690...

@alexcrichton
Copy link
Member Author

@bors: retry force

@bors
Copy link
Contributor

bors commented Feb 12, 2016

⌛ Testing commit 00bf071 with merge e733a31...

@bors
Copy link
Contributor

bors commented Feb 12, 2016

💔 Test failed - auto-mac-64-nopt-t

@alexcrichton
Copy link
Member Author

@bors: retry

On Thu, Feb 11, 2016 at 9:53 PM, bors notifications@github.com wrote:

[image: 💔] Test failed - auto-mac-64-nopt-t
http://buildbot.rust-lang.org/builders/auto-mac-64-nopt-t/builds/8050


Reply to this email directly or view it on GitHub
#31460 (comment).

@bors
Copy link
Contributor

bors commented Feb 12, 2016

⌛ Testing commit 00bf071 with merge 37ab094...

@bors
Copy link
Contributor

bors commented Feb 12, 2016

💔 Test failed - auto-linux-64-x-android-t

@alexcrichton
Copy link
Member Author

@bors: retry

On Fri, Feb 12, 2016 at 4:51 AM, bors notifications@github.com wrote:

[image: 💔] Test failed - auto-linux-64-x-android-t
http://buildbot.rust-lang.org/builders/auto-linux-64-x-android-t/builds/8006


Reply to this email directly or view it on GitHub
#31460 (comment).

@bors
Copy link
Contributor

bors commented Feb 13, 2016

⌛ Testing commit 00bf071 with merge 27e3610...

@bors
Copy link
Contributor

bors commented Feb 13, 2016

💔 Test failed - auto-linux-64-x-android-t

@alexcrichton
Copy link
Member Author

Whelp, shoulda known that any segfault from unprefixing jemalloc is indicative of mixing allocators. Looks like on Android allocator replacement in libc somehow doesn't work. We call realpath with a NULL second parameter to indicate that it should return to us a buffer allocated with malloc. We then later call libc::free on the returned buffer. Presumably libc use the libc malloc, and we use the jemalloc free, hence badness!

I'll just re-prefix the symbols on android for now.

Right now the primary hashing algorithm of the compiler isn't actually inlined
across crates, meaning that it may be missing out on some crucial optimizations
in a few places (perhaps unrolling smaller loops, etc).

This commit made the hashing function disappear from a profiled version of the
compiler, but that's likely because it was just inlined elsewhere. When
compiling winapi, however, this decreased compile time from 18.3 to 17.8 seconds
(a 3% improvement).
Back in 9bc8e6d the linking of rlibs changed to using the `link_whole_rlib`
function. This change, however was only intended to affect dylibs, not
executables. For executables we don't actually want to link entire rlibs because
we want the linker to strip out as much as possible.

This commit adds a conditional to this logic to only link entire rlibs if we're
creating a dylib, and otherwise an executable just links an rlib as usual. A
test is included which will fail to link if this behavior is reverted.
Now that we properly only link in jemalloc when building executables, we have
far less to worry about in terms of polluting the global namespace with the
`free` and `malloc` symbols on Linux. This commit will primarily allow LLVM to
use jemalloc so the compiler will only be using one allocator overall.

Locally this took compile time for libsyntax from 95 seconds to 89 (a 6%
improvement).
@alexcrichton
Copy link
Member Author

@bors: r=brson e3b414d

@bors
Copy link
Contributor

bors commented Feb 14, 2016

⌛ Testing commit e3b414d with merge a888333...

bors added a commit that referenced this pull request Feb 14, 2016
Looking at some profiles of rustc recently, these seemed like two pretty easy wins we could get in terms of performance on the table.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Marks issues that should be documented in the release notes of the next release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants