Be less aggressive with `DroplessArena`/`TypedArena` growth. #71872
(rust_highfive has picked a reviewer for you, use r? to override)
@bors try @rust-timer queue
Awaiting bors try build completion
⌛ Trying commit 7a10dbe7be936d200bce07396c30ab8b1d9b8f93 with merge 9bcded1310a9c28fe29ec51d42ee18c5a1eb761e...
I wonder if we should increase the first chunk, too - 4 KB seems quite small to me, most crates use much more I imagine? It would be nice to get statistics I guess on how much memory each arena uses in total. In any case, it probably doesn't matter that much, I'm guessing the arena allocations are all large enough that the allocator falls back to mmap or whatever anyway. One thing that I had wanted to experiment with is using (at least on Linux) huge pages for our arenas with the intent of reducing TLB misses - but I haven't had time to investigate. Anyway, this is all just throwing thoughts out there, not things that need to happen in this PR.
☀️ Try build successful - checks-azure
Queued 9bcded1310a9c28fe29ec51d42ee18c5a1eb761e with parent d6823ba, future comparison URL.
Finished benchmarking try commit 9bcded1310a9c28fe29ec51d42ee18c5a1eb761e, comparison URL.
I think 4 KiB is a good choice. We don't want smaller than that because it's commonly the page size. As for larger, in my profiles I've seen that it's pretty common to have |
The max RSS results are noisy as usual, but I didn't expect an improvement there. The wasted space from the overly-large chunks shouldn't contribute to RSS, because the memory is never touched. So why bother with this, then? Two reasons:
Force-pushed from 7a10dbe to 53f6859.
I have updated to use 2 MiB as the limit.
I'm not sure if this is an issue, but changing from exponential to linear means that the number of memory mappings increase (assuming the underlying malloc does mmap for large allocations) and there's a limit on those ( |
Eight days with no review response. Let's try a different reviewer. r? @oli-obk
just some nits, r=me with those resolved
For the given code paths, the amount of space used in the previous chunk is irrelevant. (This will almost never make a difference to behaviour, but it makes the code clearer.)
`DroplessArena` and `TypedArena` use an aggressive growth strategy: the first chunk is 4 KiB, the second is 8 KiB, and it keeps on doubling indefinitely. DHAT profiles show that sometimes this results in large chunks (e.g. 16-128 MiB) that are barely filled. Although these don't contribute to RSS, they clog up the DHAT profiles. This commit changes things so that the doubling stops at 2 MiB. This is large enough that chunk allocations are still rare (you might get 100s instead of 10s of them) but avoids lots of unused space in the worst case. It gives a slight speed-up to cycle counts in some cases.
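The "100s instead of 10s" estimate can be checked with a little arithmetic. The sketch below is not taken from the PR: `chunks_needed`, `FIRST_CHUNK`, and `CAP` are hypothetical names, and it assumes chunks are filled completely, which a real arena does not guarantee. Only the 4 KiB first chunk and the 2 MiB cap come from the commit message.

```rust
// Hypothetical constants for illustration; only the 4 KiB first chunk and
// the 2 MiB cap come from the commit message.
const FIRST_CHUNK: usize = 4 * 1024;
const CAP: usize = 2 * 1024 * 1024;

/// Number of chunks needed to hold `total` bytes when each chunk is double
/// the size of the previous one, optionally stopping the doubling at `cap`.
/// Assumes every chunk is filled completely (a simplification).
fn chunks_needed(total: usize, cap: Option<usize>) -> usize {
    let mut size = FIRST_CHUNK;
    let mut allocated = 0usize;
    let mut count = 0usize;
    while allocated < total {
        allocated += size;
        count += 1;
        let next = size * 2;
        size = match cap {
            Some(c) => next.min(c),
            None => next,
        };
    }
    count
}

fn main() {
    let total = 256 * 1024 * 1024; // 256 MiB of arena data
    println!("unbounded doubling: {}", chunks_needed(total, None));
    println!("capped at 2 MiB:    {}", chunks_needed(total, Some(CAP)));
}
```

For 256 MiB of data this gives 17 chunks with unbounded doubling and 137 chunks with the cap. In the unbounded case the 17th chunk is itself 256 MiB and holds only the last 4 KiB, which is the kind of barely filled chunk the commit message describes.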
Force-pushed from 53f6859 to 40d4868.
@bors r=oli-obk
📌 Commit 40d4868 has been approved by |
…rowth, r=oli-obk Be less aggressive with `DroplessArena`/`TypedArena` growth. `DroplessArena` and `TypedArena` use an aggressive growth strategy: the first chunk is 4 KiB, the second is 8 KiB, and it keeps on doubling indefinitely. DHAT profiles show that sometimes this results in large chunks (e.g. 16-128 MiB) that are barely filled. This commit changes things so that the doubling stops at 2 MiB. This is large enough that chunk allocations are still rare (you might get 100s instead of 10s of them) but avoids lots of unused space in the worst case. It makes the same change to `TypedArena`, too.
(I am going to assume that this is deliberately not rollup=never, and include this in rollups. If that is undesired, please mark PR as rollup=never.)
Correct, the perf effect is small enough that this would be fine in a rollup.
My (complete) guess would be that some code is still expecting the capacity to continue doubling after hitting the |
It's not obviously the fault of this PR. I'm having trouble imagining how this PR would cause lots of test failures only on Windows builds. This is platform-independent allocator code... if the code was bad, I would expect failures on all platforms. Can we wait for it to land by itself? That would give a clear signal whether this PR is the problem. (It's a shame we can't schedule a try run on specific platforms in advance of merging.)
It could also be 32-bit-only, and the Windows builders are just the first to fail.
Anyway, let's make sure not to roll this up until we get confirmation.
☀️ Test successful - checks-azure
Strange, looks like all the PRs in the intersection of those two rollups have landed now...
`DroplessArena` and `TypedArena` use an aggressive growth strategy: each chunk is double the size of the previous. The first chunk is 4 KiB, the second is 8 KiB, the third is 16 KiB, and so on, indefinitely. (Note that each new chunk is added to the arena; old chunks are kept, and there is no reallocation of chunks occurring.) DHAT profiles show that sometimes this results in large chunks (e.g. 16-128 MiB) that are barely filled.

This commit changes things so that the doubling stops at 2 MiB. This is large enough that chunk allocations are still rare (you might get 100s instead of 10s of them) but avoids lots of unused space in the worst case. It makes the same change to `TypedArena`, too.
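
To make the chunk policy concrete, here is a minimal toy arena. It is a sketch under stated assumptions, not rustc's `DroplessArena` or `TypedArena`: `ToyArena`, `Chunk`, `PAGE`, and `MAX_CHUNK` are hypothetical names, alignment is ignored, and allocations are identified by a (chunk index, offset) pair rather than a pointer. It shows only the two properties described above: old chunks are kept and never reallocated, and the doubling stops at 2 MiB.

```rust
// Toy illustration only; all names here are hypothetical.
const PAGE: usize = 4096; // size of the first chunk (4 KiB)
const MAX_CHUNK: usize = 2 * 1024 * 1024; // doubling stops here (2 MiB)

struct Chunk {
    storage: Box<[u8]>, // fixed size, never reallocated
    used: usize,
}

struct ToyArena {
    chunks: Vec<Chunk>, // old chunks are kept; new ones are appended
}

impl ToyArena {
    fn new() -> Self {
        ToyArena { chunks: Vec::new() }
    }

    /// Copies `bytes` into the arena and returns (chunk index, offset).
    fn alloc(&mut self, bytes: &[u8]) -> (usize, usize) {
        let fits = self
            .chunks
            .last()
            .map_or(false, |c| c.storage.len() - c.used >= bytes.len());
        if !fits {
            self.grow(bytes.len());
        }
        let idx = self.chunks.len() - 1;
        let chunk = &mut self.chunks[idx];
        let offset = chunk.used;
        chunk.storage[offset..offset + bytes.len()].copy_from_slice(bytes);
        chunk.used += bytes.len();
        (idx, offset)
    }

    /// Appends a new chunk: double the previous size, capped at MAX_CHUNK,
    /// but always large enough for the request that triggered the growth.
    fn grow(&mut self, needed: usize) {
        let doubled = match self.chunks.last() {
            Some(prev) => prev.storage.len().min(MAX_CHUNK / 2) * 2,
            None => PAGE,
        };
        let size = doubled.max(needed);
        self.chunks.push(Chunk {
            storage: vec![0u8; size].into_boxed_slice(),
            used: 0,
        });
    }

    fn chunk_sizes(&self) -> Vec<usize> {
        self.chunks.iter().map(|c| c.storage.len()).collect()
    }
}

fn main() {
    let mut arena = ToyArena::new();
    // Store 8 MiB of data in 1 KiB pieces.
    for _ in 0..8192 {
        arena.alloc(&[0u8; 1024]);
    }
    println!("chunk sizes: {:?}", arena.chunk_sizes());
}
```

Storing 8 MiB in 1 KiB pieces produces 13 chunks: ten that double from 4 KiB up to 2 MiB, followed by three more 2 MiB chunks. Without the cap, the eleventh chunk would have been 4 MiB and the twelfth 8 MiB, with most of the latter unused.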