Optimizing Stacked Borrows (part 2): Shrink Item #2315
How is that implementing the correct semantics? Note that with a type like EDIT: Oh I see you have a TODO that probably relates to this. I assume this currently makes a test fail? |
Yes this absolutely implements the wrong semantics right now, and the two tests for internal mutability fail. I'm going to touch up the rest of this first and hope you decide on the behavior of |
Well I am personally pretty much settled that I want them to completely infect the reference. But this seems like a big change and I didn't see a lot of other people (in particular the lang team) enthusiastically agree with that... |
Meanwhile, what about the following scheme: we sacrifice another bit in the item to indicate whether or not this item is protected. That should be enough, because all items for a given tag |
🤷 All that I can find related to this decision is the rfcbot comment which got un-nominated with no evidence of discussion. I haven't seen any objections from any lang team members, and it's only been 5 days since it was un-nominated. Maybe I'm just very patient. I was considering sacrificing another bit anyway to bypass the check in |
Oh 🤦 wait I get it. Yes this is a great idea.
That's a different discussion. If/when that lands (which I expect it will), that brings the doc in sync with what Miri already does since #2248. What you would have needed is rust-lang/unsafe-code-guidelines#236.
Now I just have UI test failures because printing a tag can no longer print the CallId that it was protected by. Is that okay?
Also, should we even have a second |
Only having the packed thing would make the code simpler? Sure, then go for it. (I guess for |
d9c5d39 to 548ca37
☔ The latest upstream changes (presumably #2333) made this pull request unmergeable. Please resolve the merge conflicts.
☔ The latest upstream changes (presumably #2338) made this pull request unmergeable. Please resolve the merge conflicts.
This comment was marked as off-topic.
@rustbot author
There we go. :)
✌️ @saethlin can now approve this pull request
Previously, Item was a struct of a NonZeroU64, an Option<CallId> which was usually unset or irrelevant, and a 4-variant enum. Collectively, the size of an Item was 24 bytes, but for the most part only 8 of those bytes were used. This change takes advantage of the fact that it is probably impossible to exhaust the total space of SbTags, and steals 3 bits from it to pack the whole struct into a single u64. This bit-packing reduces peak memory usage by ~3x when Miri goes memory-bound. We also get CPU performance improvements of varying size: not only are we accessing less memory, we can now compare a Vec<Item> using a memcmp because Item no longer has any padding.
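The packing described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the exact bit layout (61 tag bits, 2 permission bits, 1 protected flag), the constant names, and the accessor names are all assumptions made for the example.

```rust
use std::num::NonZeroU64;

/// Stand-in for the 4-variant permission enum; the discriminant values are assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Permission {
    Unique = 0,
    SharedReadWrite = 1,
    SharedReadOnly = 2,
    Disabled = 3,
}

// Assumed layout: bits 0..=60 hold the tag, bits 61..=62 the permission,
// and bit 63 the "protected" flag. That is 3 bits stolen from the tag space.
const TAG_BITS: u32 = 61;
const TAG_MASK: u64 = (1 << TAG_BITS) - 1;
const PERM_SHIFT: u32 = TAG_BITS;
const PERM_MASK: u64 = 0b11 << PERM_SHIFT;
const PROTECTED_BIT: u64 = 1 << 63;

/// The whole item fits in one u64, so there is no padding.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Item(u64);

impl Item {
    fn new(tag: NonZeroU64, perm: Permission, protected: bool) -> Self {
        // Tags that do not fit in 61 bits cannot be represented any more.
        assert!(tag.get() <= TAG_MASK, "tag space exhausted");
        let mut bits = tag.get() | ((perm as u64) << PERM_SHIFT);
        if protected {
            bits |= PROTECTED_BIT;
        }
        Item(bits)
    }

    fn tag(self) -> NonZeroU64 {
        NonZeroU64::new(self.0 & TAG_MASK).expect("tag is nonzero by construction")
    }

    fn perm(self) -> Permission {
        match (self.0 & PERM_MASK) >> PERM_SHIFT {
            0 => Permission::Unique,
            1 => Permission::SharedReadWrite,
            2 => Permission::SharedReadOnly,
            _ => Permission::Disabled,
        }
    }

    fn protected(self) -> bool {
        self.0 & PROTECTED_BIT != 0
    }
}

fn main() {
    let item = Item::new(NonZeroU64::new(42).unwrap(), Permission::SharedReadOnly, true);
    assert_eq!(item.tag().get(), 42);
    assert_eq!(item.perm(), Permission::SharedReadOnly);
    assert!(item.protected());
    // One machine word per item, instead of a 24-byte struct.
    assert_eq!(std::mem::size_of::<Item>(), 8);
}
```

Because the sketched `Item` is a plain `u64` wrapper, two `Vec<Item>` values can be compared bytewise, which is the property the memcmp remark relies on.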
stacked_borrow now has an item module, and its own FrameExtra. These serve to protect the implementation of Item (which is a bunch of bit-packing tricks) from the primary logic of Stacked Borrows, and the FrameExtra we have separates Stacked Borrows more cleanly from the interpreter itself. The new strategy for checking protectors also makes some subtle performance tradeoffs, so they are now documented in Stack::item_popped because that function primarily benefits from them, and it also touches every aspect of them. Also, separating the actual CallId that is protecting a Tag from the Tag makes it inconvenient to reproduce exactly the same protector errors, so this also takes the opportunity to use some slightly cleaner English in those errors. We need to make some change, so we might as well make it good.
I squashed a bit more than a little, but I think I also wrote a fittingly large message for the second commit.
@bors r+
☀️ Test successful - checks-actions
Add a benchmark of the hang-on-test-failure code path

This is the code pattern that produces the performance problem in #2273.

I figured out what I was stuck on in #2315 (comment). For a while I was just doing `let x: &[u8] = &[0u8; 4096];` but that doesn't produce the runtime inside `Stack::item_popped` that I was looking for, I think because this allocation is never deallocated. But with `Vec`, I get the profile I'm looking for.
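The difference between the two allocation patterns mentioned in this commit message can be sketched as below. This is a hypothetical illustration, not the benchmark that was actually committed; the function name `sum_heap_buffer` and the buffer contents are assumptions.

```rust
/// Allocates a heap buffer, reads it through a shared slice, and returns the
/// sum of its bytes. The `Vec` is dropped at the end of the function, so the
/// allocation is actually deallocated, unlike a borrowed constant array such
/// as `&[0u8; 4096]`.
fn sum_heap_buffer(len: usize) -> u64 {
    let buffer: Vec<u8> = vec![1u8; len];
    // Reborrowing as a slice creates a fresh shared reference over every byte.
    let slice: &[u8] = &buffer;
    let total: u64 = slice.iter().map(|&b| u64::from(b)).sum();
    total
    // `buffer` goes out of scope here and its memory is freed.
}

fn main() {
    assert_eq!(sum_heap_buffer(4096), 4096);
}
```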
This moves protectors out of `Item`, storing them both in a global `HashSet` which contains all currently-protected tags as well as a `Vec<SbTag>` on each `Frame` so that when we return from a function we know which tags to remove from the protected set.

This also bit-packs the 64-bit tag and the 2-bit permission together when they are stored in memory. This means we theoretically run out of tags sooner, but I doubt that limit will ever be hit.
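The protector bookkeeping described above can be sketched like this. It is a simplified model under stated assumptions: `SbTag` is reduced to a plain `u64`, and the type and method names (`GlobalState`, `FrameExtra`, `protect`, `end_call`) are made up for the example rather than taken from Miri.

```rust
use std::collections::HashSet;

/// Simplified stand-in for a Stacked Borrows tag.
type SbTag = u64;

/// Per-frame data: the tags this call protected, so they can be released on return.
#[derive(Default)]
struct FrameExtra {
    protected_tags: Vec<SbTag>,
}

/// Global data: every tag that is currently protected by some live call.
#[derive(Default)]
struct GlobalState {
    protected_tags: HashSet<SbTag>,
}

impl GlobalState {
    /// Record that `tag` is protected for the duration of the call owning `frame`.
    fn protect(&mut self, frame: &mut FrameExtra, tag: SbTag) {
        self.protected_tags.insert(tag);
        frame.protected_tags.push(tag);
    }

    /// Protector check: a single set lookup, with no per-item protector field.
    fn is_protected(&self, tag: SbTag) -> bool {
        self.protected_tags.contains(&tag)
    }

    /// On function return, remove exactly the tags this frame protected.
    fn end_call(&mut self, frame: FrameExtra) {
        for tag in frame.protected_tags {
            self.protected_tags.remove(&tag);
        }
    }
}

fn main() {
    let mut global = GlobalState::default();
    let mut frame = FrameExtra::default();

    global.protect(&mut frame, 7);
    assert!(global.is_protected(7));
    assert!(!global.is_protected(8));

    global.end_call(frame);
    assert!(!global.is_protected(7));
}
```

Keeping the list of tags on the frame means the return path never has to scan the global set to find what to unprotect.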
Together these optimizations reduce the memory footprint of Miri when executing programs which stress Stacked Borrows by ~66%. For example, running a test with isolation off which only panics currently peaks at ~19 GB; with this PR it peaks at ~6.2 GB.
To-do
`UnsafeCell` to become infectious, or express offsets + tags in the global protector set

Benchmarks before:
After:
The decrease in system time is probably due to spending less time in the page fault handler.