Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Be less aggressive with DroplessArena/TypedArena growth. #71872

Merged
merged 2 commits into from
May 16, 2020

Conversation

nnethercote
Copy link
Contributor

@nnethercote nnethercote commented May 4, 2020

DroplessArena and TypedArena use an aggressive growth strategy: each chunk is double the size of the previous. The first chunk is 4 KiB, the second is 8 KiB, the third is 16 KiB, and so on, indefinitely. (Note that each new chunk is added to the arena; old chunks are kept, and there is no reallocation of chunks occurring.) DHAT profiles show that sometimes this results in large chunks (e.g. 16-128 MiB) that are barely filled.

This commit changes things so that the doubling stops at 2 MiB. This is large enough that chunk allocations are still rare (you might get 100s instead of 10s of them) but avoids lots of unused space in the worst case. It makes the same change to TypedArena, too.

@rust-highfive
Copy link
Collaborator

r? @LukasKalbertodt

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 4, 2020
@nnethercote
Copy link
Contributor Author

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented May 4, 2020

⌛ Trying commit 7a10dbe7be936d200bce07396c30ab8b1d9b8f93 with merge 9bcded1310a9c28fe29ec51d42ee18c5a1eb761e...

@Mark-Simulacrum
Copy link
Member

I wonder if we should increase the first chunk, too - 4 KB seems quite small to me, most crates use much more I imagine? It would be nice to get statistics I guess on how much memory each arena uses in total. In any case, it probably doesn't matter that much, I'm guessing the arena allocations are all large enough that the allocator falls back to mmap or whatever anyway.

One thing that I had wanted to experiment with is using (at least on Linux) huge pages for our arenas with the intent of reducing TLB misses - but I haven't had time to investigate.

Anyway, this is all just throwing thoughts out there, not things that need to happen in this PR.

src/libarena/lib.rs Outdated Show resolved Hide resolved
@bors
Copy link
Contributor

bors commented May 4, 2020

☀️ Try build successful - checks-azure
Build commit: 9bcded1310a9c28fe29ec51d42ee18c5a1eb761e (9bcded1310a9c28fe29ec51d42ee18c5a1eb761e)

@rust-timer
Copy link
Collaborator

Queued 9bcded1310a9c28fe29ec51d42ee18c5a1eb761e with parent d6823ba, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 9bcded1310a9c28fe29ec51d42ee18c5a1eb761e, comparison URL.

@nnethercote
Copy link
Contributor Author

I wonder if we should increase the first chunk, too - 4 KB seems quite small to me

I think 4 KiB is a good choice. We don't want smaller than that because it's commonly the page size. As for larger, in my profiles I've seen that it's pretty common to have TypedArenas that are barely used, e.g. only 64 bytes out of 4 KiB are used. I'm going to see if I can avoid some of those today, but in general the design is nicely flexible -- if the arena is barely used you only waste ~4 KiB but if it's used a lot it quickly scales up to 1 or 2 MiB, and then with the changes in this PR we don't get overly large chunks that can waste lots of space.

@nnethercote
Copy link
Contributor Author

The max RSS results are noisy as usual, but I didn't expect an improvement there. The wasted space from the overly-large chunks shouldn't contribute to RSS, because the memory is never touched.

So why bother with this, then? Two reasons:

  • The cycle counts in the perf results look like a slight improvement overall, and maybe likewise with wall-times though it's harder to tell.
  • I'm starting a push on memory usage and these overly-large chunks clog up DHAT's output. Getting rid of them will make it easier for me to make subsequent progress.

@nnethercote nnethercote force-pushed the less-aggressive-arena-growth branch from 7a10dbe to 53f6859 Compare May 4, 2020 22:24
@nnethercote
Copy link
Contributor Author

I have updated to use 2 MiB as the limit.

@the8472
Copy link
Member

the8472 commented May 5, 2020

I'm not sure if this is an issue, but changing from exponential to linear means that the number of memory mappings increase (assuming the underlying malloc does mmap for large allocations) and there's a limit on those (vm.max_map_count = 65530 on my system). That means it lowers the maximum that can be allocated via these arenas to at most 130GB, likely less.
That ought to be enough for anyone, but still might be a noteworthy change.

@nnethercote
Copy link
Contributor Author

Eight days with no review response. Let's try a different reviewer.

r? @oli-obk

src/libarena/lib.rs Outdated Show resolved Hide resolved
@oli-obk
Copy link
Contributor

oli-obk commented May 12, 2020

just some nits, r=me with those resolved

For the given code paths, the amount of space used in the previous chunk
is irrelevant.

(This will almost never make a difference to behaviour, but it makes the
code clearer.)
`DroplessArena` and `TypedArena` use an aggressive growth strategy: the
first chunk is 4 KiB, the second is 8 KiB, and it keeps on doubling
indefinitely. DHAT profiles show that sometimes this results in large
chunks (e.g. 16-128 MiB) that are barely filled. Although these don't
contribute to RSS, they clog up the DHAT profiles.

This commit changes things so that the doubling stops at 2 MiB. This is
large enough that chunk allocations are still rare (you might get 100s
instead of 10s of them) but avoids lots of unused space in the worst
case. It gives a slight speed-up to cycle counts in some cases.
@nnethercote nnethercote force-pushed the less-aggressive-arena-growth branch from 53f6859 to 40d4868 Compare May 13, 2020 02:33
@nnethercote
Copy link
Contributor Author

@bors r=oli-obk

@bors
Copy link
Contributor

bors commented May 13, 2020

📌 Commit 40d4868 has been approved by oli-obk

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 13, 2020
RalfJung added a commit to RalfJung/rust that referenced this pull request May 14, 2020
…rowth, r=oli-obk

Be less aggressive with `DroplessArena`/`TypedArena` growth.

`DroplessArena` and `TypedArena` use an aggressive growth strategy: the first chunk is 4 KiB, the second is 8 KiB, and it keeps on doubling indefinitely. DHAT profiles show that sometimes this results in large chunks (e.g. 16-128 MiB) that are barely filled.

This commit changes things so that the doubling stops at 2 MiB. This is large enough that chunk allocations are still rare (you might get 100s instead of 10s of them) but avoids lots of unused space in the worst case. It makes the same change to `TypedArena`, too.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request May 14, 2020
…rowth, r=oli-obk

Be less aggressive with `DroplessArena`/`TypedArena` growth.

`DroplessArena` and `TypedArena` use an aggressive growth strategy: the first chunk is 4 KiB, the second is 8 KiB, and it keeps on doubling indefinitely. DHAT profiles show that sometimes this results in large chunks (e.g. 16-128 MiB) that are barely filled.

This commit changes things so that the doubling stops at 2 MiB. This is large enough that chunk allocations are still rare (you might get 100s instead of 10s of them) but avoids lots of unused space in the worst case. It makes the same change to `TypedArena`, too.
RalfJung added a commit to RalfJung/rust that referenced this pull request May 15, 2020
…rowth, r=oli-obk

Be less aggressive with `DroplessArena`/`TypedArena` growth.

`DroplessArena` and `TypedArena` use an aggressive growth strategy: the first chunk is 4 KiB, the second is 8 KiB, and it keeps on doubling indefinitely. DHAT profiles show that sometimes this results in large chunks (e.g. 16-128 MiB) that are barely filled.

This commit changes things so that the doubling stops at 2 MiB. This is large enough that chunk allocations are still rare (you might get 100s instead of 10s of them) but avoids lots of unused space in the worst case. It makes the same change to `TypedArena`, too.
RalfJung added a commit to RalfJung/rust that referenced this pull request May 15, 2020
…rowth, r=oli-obk

Be less aggressive with `DroplessArena`/`TypedArena` growth.

`DroplessArena` and `TypedArena` use an aggressive growth strategy: the first chunk is 4 KiB, the second is 8 KiB, and it keeps on doubling indefinitely. DHAT profiles show that sometimes this results in large chunks (e.g. 16-128 MiB) that are barely filled.

This commit changes things so that the doubling stops at 2 MiB. This is large enough that chunk allocations are still rare (you might get 100s instead of 10s of them) but avoids lots of unused space in the worst case. It makes the same change to `TypedArena`, too.
@RalfJung
Copy link
Member

(I am going to assume that this is deliberately not rollup=never, and include this in rollups. If that is undesired, please mark PR as rollup=never.)

@nnethercote
Copy link
Contributor Author

Correct, the perf effect is small enough that this would be fine in a rollup.

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request May 15, 2020
…rowth, r=oli-obk

Be less aggressive with `DroplessArena`/`TypedArena` growth.

`DroplessArena` and `TypedArena` use an aggressive growth strategy: the first chunk is 4 KiB, the second is 8 KiB, and it keeps on doubling indefinitely. DHAT profiles show that sometimes this results in large chunks (e.g. 16-128 MiB) that are barely filled.

This commit changes things so that the doubling stops at 2 MiB. This is large enough that chunk allocations are still rare (you might get 100s instead of 10s of them) but avoids lots of unused space in the worst case. It makes the same change to `TypedArena`, too.
@RalfJung
Copy link
Member

Something failed both #72226 and #72228, and there are only two PRs left in the intersection: this one, and #72090. The latter is an unlikely candidate.

So could you check those errors and see if they might be caused by this PR?

@CAD97
Copy link
Contributor

CAD97 commented May 15, 2020

My (complete) guess would be that some code is still expecting the capacity to continue doubling after hitting the HUGE_PAGE size.

@nnethercote
Copy link
Contributor Author

So could you check those errors and see if they might be caused by this PR?

It's not obviously the fault of this PR. I'm having trouble imagining how this PR would cause lots of test failures only on Windows builds. This is platform-independent allocator code... if the code was bad, I would expect failures on all platforms.

Can we wait for it to land by itself? That would give a clear signal whether this PR is the problem. (It's a shame we can't schedule a try run on specific platforms in advance of merging.)

@RalfJung
Copy link
Member

It could also be 32bit-only, and the Windows builders are just the first to fail.

@RalfJung
Copy link
Member

Anyway let's make sure to not roll this up until we got confirmation.
@bors rollup=never

@bors
Copy link
Contributor

bors commented May 16, 2020

⌛ Testing commit 40d4868 with merge 31add7e...

@bors
Copy link
Contributor

bors commented May 16, 2020

☀️ Test successful - checks-azure
Approved by: oli-obk
Pushing 31add7e to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 16, 2020
@bors bors merged commit 31add7e into rust-lang:master May 16, 2020
@RalfJung
Copy link
Member

Strange, looks like all the PRs in the intersection of those two rollups have landed now...

@nnethercote nnethercote deleted the less-aggressive-arena-growth branch May 17, 2020 20:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants