
mpk: optimize layout of protected stripes #7603

Merged
merged 3 commits into bytecodealliance:main
Nov 29, 2023

Conversation

abrown
Contributor

@abrown abrown commented Nov 29, 2023

While experimenting with the limits of MPK-protected memory pools, @alexcrichton and I discovered that the current slab layout calculations were too conservative. This meant that the memory pool could not pack in as many memories as it should have been able to: we were expecting, but not seeing, ~15x more memory slots than with non-MPK memory pools.

The fix ends up being simpler than the original: we must maintain the codegen constraint that a static memory's OOB accesses are inaccessible within a static_memory_maximum_size + static_memory_guard_size region (called expected_slot_bytes + guard_bytes in memory_pool.rs). By dividing that region up among the stripes, we still guarantee it is inaccessible, since the other MPK-protected stripes packed into it use different protection keys. And we still need to make sure that post_slab_guard_bytes makes up the remainder of that region. These changes fix the memory inefficiency issues we were seeing.
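As a rough sketch of the layout math described above (the function and its names are illustrative, not Wasmtime's actual `memory_pool.rs` code): the codegen-required region is split among the stripes, and each stripe's share is rounded up to a page boundary.

```rust
/// Illustrative sketch only. `expected_slot_bytes` mirrors
/// `static_memory_maximum_size` and `guard_bytes` mirrors
/// `static_memory_guard_size`, as described in the PR text.
fn bytes_per_stripe_slot(
    expected_slot_bytes: u64,
    guard_bytes: u64,
    num_stripes: u64,
    page_size: u64,
) -> u64 {
    // The whole region in which OOB accesses must fault...
    let region = expected_slot_bytes + guard_bytes;
    // ...divided among the stripes (rounding up so the stripes cover it)...
    let per_stripe = (region + num_stripes - 1) / num_stripes;
    // ...and rounded up to a page boundary, since slots are mapped at page
    // granularity.
    (per_stripe + page_size - 1) / page_size * page_size
}
```

With 12 stripes this divides a 6G region (4G memory + 2G guard) into exact 512M slots; with a stripe count that does not divide the region evenly, the page rounding adds a small overhead per slot.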

abrown and others added 3 commits November 29, 2023 10:48
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
@alexcrichton pointed out that we know `slot_bytes /
max_memory_bytes` will be at least 1 due to a `max` comparison above.
Knowing this, we can remove a `+ 1` intended for the case when
`needed_num_stripes == 0`, which should be impossible.
This style change is a readability improvement; no calculations should
change.

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
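The stripe-count observation in the second commit above can be sketched as follows (the function name and shape are illustrative, not Wasmtime's actual code):

```rust
// Illustrative sketch: because `slot_bytes` is first clamped up to at
// least `max_memory_bytes` by a `max` comparison, the quotient below is
// always >= 1, so no `+ 1` fixup for a zero stripe count is needed.
fn needed_num_stripes(slot_bytes: u64, max_memory_bytes: u64) -> u64 {
    let slot_bytes = slot_bytes.max(max_memory_bytes);
    slot_bytes / max_memory_bytes
}
```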
@abrown abrown marked this pull request as ready for review November 29, 2023 19:25
@abrown abrown requested a review from a team as a code owner November 29, 2023 19:25
@abrown abrown requested review from pchickey and removed request for a team November 29, 2023 19:25
@alexcrichton
Member

Looks good to me!

I'd be happy to approve and land this as-is, but we talked about some other things, somewhat unrelated to this PR, which I think you're going to include here as well.

I'll also note one point which, I'll emphasize, shouldn't be addressed in this PR because I don't think it's worth it. This strategy is a tiny bit wasteful in terms of pages, because the slot size is unlikely to be a multiple of the number of stripes, meaning something will be rounded up to a page boundary. For example, a 4G memory, a 2G guard, and a 512M max memory size will support up to 12 stripes (6G/512M == 12). If, however, only 11 stripes were available, then what should in theory happen is that one 6G slot would have 11 memories of max size 512M packed together, with some leftover guard space at the end. That would mean 11 memories packed into a 6G region.

With this PR, however, what would happen is that the 6G region would be divided into 11 chunks of size 0x22e8ba2e. That'd then be rounded up to 0x22e8c000 for a single slot, which if that's multiplied back by 11 yields 0x180004000 bytes. This means that with "ideal logic" a 6G region could hold 11 memories but this calculation instead requires a 6G+16K region to hold the 11 memories.

This is a very minor loss (four pages against 6G), however, and accounting for guard regions interspersed with all the slots would require deeper refactoring. That's why I want to note this, but I don't think it's worth changing.
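The arithmetic in this example can be checked with a short sketch (helper names are hypothetical; the constants come from the comment above):

```rust
// Round `bytes` up to the next multiple of `page`.
fn round_up_to_page(bytes: u64, page: u64) -> u64 {
    (bytes + page - 1) / page * page
}

// With 11 stripes, the 6G region (4G memory + 2G guard) divides into
// chunks of 0x22e8ba2e bytes, each rounded up to 0x22e8c000; eleven such
// page-rounded slots need 0x1_8000_4000 bytes, i.e. 6G + 16K.
fn example_overhead() -> u64 {
    let region: u64 = 6 << 30;
    let stripes: u64 = 11;
    let slot = round_up_to_page(region / stripes, 4096);
    slot * stripes - region // bytes required beyond the ideal 6G
}
```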

@abrown abrown added this pull request to the merge queue Nov 29, 2023
Merged via the queue into bytecodealliance:main with commit 043e4ce Nov 29, 2023
19 checks passed
@abrown abrown deleted the pku-optimize branch November 29, 2023 20:58
abrown added a commit to abrown/wasmtime that referenced this pull request Nov 29, 2023

This reverts commit 043e4ce. The
following fails when running locally on an MPK-enabled machine:

```
WASMTIME_TEST_FORCE_MPK=1 cargo test
```
github-merge-queue bot pushed a commit that referenced this pull request Nov 29, 2023
abrown added a commit to abrown/wasmtime that referenced this pull request Nov 30, 2023
This is another attempt at bytecodealliance#7603, attempting to reduce the slab layout
sizes of MPK-protected stripes. While experimenting with the limits of
MPK-protected memory pools, @alexcrichton and I discovered that the
current slab layout calculations were too conservative. This meant that
the memory pool could not pack in as many memories as it should have
been able: we were expecting, but not seeing, ~15x more memory slots
over non-MPK memory pools.

This change brings together several fixes:
- it more aggressively divides up the stripes (as in b212152)
- it eliminates an extra stripe (as in 8813a30)
- it replaces some uses of `checked_*` with `saturating_*` (as in
  fb22a20)
- it improves some documentation
- and, crucially, it reports back a larger value for
  `memory_and_guard_size`

The failures observed with bytecodealliance#7603 when run with MPK
(`WASMTIME_TEST_FORCE_MPK=1 cargo test`) were due to `Store::wasm_fault`
not being able to identify which memory an OOB address belonged to.
This is because the `MemoryPool` was underreporting the size of the
region in which OOB accesses would fault. The correct value is provided
by the new `SlabLayout::bytes_to_next_stripe_slot`: any OOB access
within that larger region must fault because (1) the other stripes have
different protection keys and (2) a `Store` can only use one protection
key. We also use (2) to guarantee that `Store::wasm_fault` will be able
to calculate the Wasm address from the raw address.

This change also provides a new `traps` test that will reproduce the
failures from bytecodealliance#7603; if we observe `SIGABRT` from that test, it will be
a regression.
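The fault-attribution reasoning in this commit can be sketched roughly as follows (function names echo the commit message, but the code is illustrative, not Wasmtime's actual implementation):

```rust
// Illustrative sketch, not Wasmtime's actual code: slots belonging to the
// same stripe repeat every `num_stripes` slots, so the next slot protected
// by the *same* MPK key is this many bytes away. Every byte in between
// belongs to other stripes with different keys, so an OOB access from a
// store (which holds exactly one key) within this span must fault.
fn bytes_to_next_stripe_slot(slot_bytes: u64, num_stripes: u64) -> u64 {
    slot_bytes * num_stripes
}

// Reporting this larger region lets a raw faulting address be mapped back
// to a Wasm address for the memory whose slot starts at `slot_base`.
fn wasm_address(fault_addr: u64, slot_base: u64, slot_bytes: u64, num_stripes: u64) -> Option<u64> {
    let region = bytes_to_next_stripe_slot(slot_bytes, num_stripes);
    if fault_addr >= slot_base && fault_addr - slot_base < region {
        Some(fault_addr - slot_base)
    } else {
        None
    }
}
```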
abrown added a commit to abrown/wasmtime that referenced this pull request Nov 30, 2023
abrown added a commit to abrown/wasmtime that referenced this pull request Nov 30, 2023
github-merge-queue bot pushed a commit that referenced this pull request Dec 1, 2023
* mpk: optimize layout of protected stripes, again


* fix: make test x86-specific