Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cranelift-wasm: Attach table OOB traps to loads/stores #8171

Merged
merged 1 commit into from
Mar 19, 2024

Conversation

jameysharp
Copy link
Contributor

Currently, every access to a table element does a bounds-check with a conditional branch to a block that explicitly traps.

Instead, when SPECTRE mitigations are enabled, let's change the address computation to return a null pointer for out-of-bounds accesses, and then allow the subsequent load or store to trap.

This is less code in that case since we can reuse instructions we needed anyway.

In addition, when the table has constant size and the element index is a constant and mid-end optimization is enabled, this allows the bounds-check to be constant folded away. Later, #8139 will let us optimize away the select_spectre_guard instruction in this case too.

Once we also implement #8160, tests/disas/typed-funcrefs.wat should be almost as fast as native indirect function calls.

@jameysharp jameysharp requested review from a team as code owners March 18, 2024 20:22
@jameysharp jameysharp requested review from fitzgen and removed request for a team March 18, 2024 20:22
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:wasm labels Mar 18, 2024
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍

In addition to the question I put below, could you update prepare_table_addr to thread out some information "the pointer is known to not be null" or something like that? For example if spectre mitigations are disabled it's known that the load is in-bounds and therefore none of the trap metadata would be required.

Or perhaps prepare_table_addr could return a MemFlags which has the trap code pre-configured?

Comment on lines -68 to -72
let spectre_oob_cmp = if enable_table_access_spectre_mitigation {
Some((index, bound))
} else {
None
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK spectre mitigations are typically a flag due to their performance impact, but in this case because the spectre-related code is straight-line and actually involves less branches, is this actually the more performant path? Would it be worth, for example, unconditionally using select_spectre_guard?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually totally confused myself with the same question at first! But I ran it by @cfallin who confirmed the following: A bounds-check branch is almost always predicted correctly, and that means that if we don't care about Spectre, an out-of-order CPU can make more progress by speculating the load and continuing on to dependent operations past that, rather than stalling on the bounds-check computation.

So although I haven't measured it, there should generally be a performance advantage to sticking with the trapnz approach, for anyone who ain't afraid of no Spectre.

@jameysharp
Copy link
Contributor Author

Returning the memflags is an interesting option! Let me think about that a bit.

Currently, every access to a table element does a bounds-check with a
conditional branch to a block that explicitly traps.

Instead, when Spectre mitigations are enabled, let's change the address
computation to return a null pointer for out-of-bounds accesses, and
then allow the subsequent load or store to trap.

This is less code in that case since we can reuse instructions we needed
anyway.

We return the MemFlags that the memory access should use, in addition to
the address it should access. That way we don't record trap metadata on
memory access instructions which can't actually trap due to being
preceded by a `trapnz`-based bounds check, when Spectre mitigations are
disabled.

In addition, when the table has constant size and the element index is a
constant and mid-end optimization is enabled, this allows the
bounds-check to be constant folded away. Later, bytecodealliance#8139 will let us
optimize away the select_spectre_guard instruction in this case too.

Once we also implement bytecodealliance#8160, `tests/disas/typed-funcrefs.wat` should be
almost as fast as native indirect function calls.
@jameysharp
Copy link
Contributor Author

Yeah, returning MemFlags from prepare_table_addr tidied things up nicely. Thanks for the suggestion, Alex!

@jameysharp jameysharp enabled auto-merge March 19, 2024 02:24
@jameysharp jameysharp added this pull request to the merge queue Mar 19, 2024
Merged via the queue into bytecodealliance:main with commit 310e97e Mar 19, 2024
19 checks passed
@jameysharp jameysharp deleted the table-traps branch March 19, 2024 03:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:wasm cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants