Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provenance preparation for Pulley #10043

Merged
merged 8 commits into from
Jan 23, 2025

Conversation

alexcrichton
Copy link
Member

This commit is an internal refactoring of Wasmtime's runtime to prepare
to execute Pulley in MIRI. Currently today this is not possible because
Pulley does not properly respect either strict or permissive provenance
models. The goal of this refactoring is to enable fixing this in a
future commit that doesn't touch everything in the codebase. Instead
everything is touched here in this commit.

The basic problem with Pulley is that it is incompatible with the strict
provenance model of Rust which means that we'll be using "exposed
provenance" APIs to satisfy Rust's soundness requirements. In this model
we must explicitly call ptr.expose_provenance() on any pointers
which are exposed to compiled code. Arguably we should also be already
doing this for natively-compiled code but I am not certain about how
strictly this is required.

Currently in Wasmtime today we call ptr.expose_provenance() nowhere.
It also turns out, though, that we share quite a few pointers in quite a
few places with compiled code. This creates a bit of a problem! The
solution settled on in this commit looks like:

  • A new marker trait, VmSafe, is introduced. This trait is used to
    represent "safe to share with compiled code" types and enumerates some
    properties such as defined ABIs, primitives wrappers match primitive
    ABIs, and notably "does not contain a pointer".

  • A new type, VmPtr<T>, is added to represent pointers shared with
    compiled code. Internally for now this is just SendSyncPtr<T> but in
    the future it will be usize. By using SendSyncPtr<T> it shouldn't
    actually really change anything today other than requiring a lot of
    refactoring to get the types to line up.

  • The core vmctx_plus_offset* methods are updated to require
    T: VmSafe. Previously they allowed any T which is relatively
    dangerous to store any possible Rust type in Cranelift-accessible
    areas.

These three fundamental changes were introduced in this commit. All
further changes were refactoring necessary to get everything working
after these changes. For example many types in vmcontext.rs, such as
VMFuncRef, have changed to using VmPtr<T> instead of NonNull<T> or
*mut T. This is a pretty expansive change which resulted in touching a
lot of places.

One premise of VmPtr<T> is that it's non-null. This was an
additional refactoring that updated a lot of places where previously
*mut T was used and now either VmPtr<T> or NonNull<T> is used.

In the end the intention is that VmPtr<T> is used whenever pointers
are store in memory that can be accessed from Cranelift. When operating
inside of the runtime NonNull<T> or SendSyncPtr<T> is preferred
instead.

As a final note, no provenance changes have actually happened yet. For
example this doesn't fix Pulley in MIRI. What it does enable, though, is
that the future commit to fix Pulley in MIRI will be much smaller with
this having already landed.

@alexcrichton alexcrichton requested a review from a team as a code owner January 17, 2025 22:36
@alexcrichton alexcrichton requested review from dicej and removed request for a team January 17, 2025 22:36
@alexcrichton
Copy link
Member Author

Note that this is temporarily based on #10040 which accounts for the first few commits.

@cfallin
Copy link
Member

cfallin commented Jan 17, 2025

The basic problem with Pulley is that it is incompatible with the strict
provenance model of Rust

I'll link here that we have at least one idea (#9060) that would make Pulley compatible with strict provenance, albeit as a major refactor to the way that Wasmtime's data structures work. No argument that with the design today it is basically incompatible with strict provenance.

@alexcrichton
Copy link
Member Author

Oh true good point! I'm a bit skeptical of the feasibility of such an implementation route though...

One thing I've also been wondering about though, which is a bit related, is let's ignore Pulley and what's our provenance story for Cranelift-compiled x64 code? For example is it possible in Rust's model to still have strict provenance when JIT code is in the mix? Are we required regardless to use "expose provenance" as a primitive to say "modifications to this memory are happening behind you're back". I'm not sure myself.

@cfallin
Copy link
Member

cfallin commented Jan 17, 2025

It seems to me at least (with my compiler optimizer hat on) that separate compilation is the ultimate "provenance black hole" -- calling JIT code is no different from, say, calling a C library, which could legally return any pointer derived from anything reachable that we pass into it. As long as Rust supports linking to code that is not compiled by the same rustc invocation (i.e., does not adopt a fully closed-world assumption), its provenance model has to support the "outside world" escape hatch that calling JIT code would also fall into. And unless strict provenance is going to say "you are no longer allowed to link C libraries, and all indirect calls have to be statically devirtualizable" that's as far as an analysis is going to get.

@bjorn3
Copy link
Contributor

bjorn3 commented Jan 17, 2025

For example is it possible in Rust's model to still have strict provenance when JIT code is in the mix? Are we required regardless to use "expose provenance" as a primitive to say "modifications to this memory are happening behind you're back". I'm not sure myself.

Calling jitted code is no different from FFI. The foreign code may only access pointers for which it could have received provenance (either through exposed provenance or through a pointer that was passed in) Unluke with Pulley the compiler doesn't have visibility into what the foreign code does, so it for exanple doesn't know if you are loading an integer (erasing provenance) or loading a pointer (preserving provenance if it exists) or which of the available provenances was used for the memory operation. As such it has to assume that the foreign code picked the right provenance to avoid UB if the right provenance could have been picked in the first place due to being exposed to the foreign code.

Edit: What @cfallin said.

@alexcrichton
Copy link
Member Author

Ok that makes sense to me yeah. If this becomes a problem in the future we can perhaps have a different implementation of VmPtr depending on if pulley is turned on or not. Without pulley we'd require strict provenance but with pulley we'd require only permissive provenance.

@github-actions github-actions bot added wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:ref-types Issues related to reference types and GC in Wasmtime labels Jan 18, 2025
Copy link

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "wasmtime:api", "wasmtime:ref-types"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: wasmtime:ref-types

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

This commit is an internal refactoring of Wasmtime's runtime to prepare
to execute Pulley in MIRI. Currently today this is not possible because
Pulley does not properly respect either strict or permissive provenance
models. The goal of this refactoring is to enable fixing this in a
future commit that doesn't touch everything in the codebase. Instead
everything is touched here in this commit.

The basic problem with Pulley is that it is incompatible with the strict
provenance model of Rust which means that we'll be using "exposed
provenance" APIs to satisfy Rust's soundness requirements. In this model
we must explicitly call `ptr.expose_provenance()` on any pointers
which are exposed to compiled code. Arguably we should also be already
doing this for natively-compiled code but I am not certain about how
strictly this is required.

Currently in Wasmtime today we call `ptr.expose_provenance()` nowhere.
It also turns out, though, that we share quite a few pointers in quite a
few places with compiled code. This creates a bit of a problem! The
solution settled on in this commit looks like:

* A new marker trait, `VmSafe`, is introduced. This trait is used to
  represent "safe to share with compiled code" types and enumerates some
  properties such as defined ABIs, primitives wrappers match primitive
  ABIs, and notably "does not contain a pointer".

* A new type, `VmPtr<T>`, is added to represent pointers shared with
  compiled code. Internally for now this is just `SendSyncPtr<T>` but in
  the future it will be `usize`. By using `SendSyncPtr<T>` it shouldn't
  actually really change anything today other than requiring a lot of
  refactoring to get the types to line up.

* The core `vmctx_plus_offset*` methods are updated to require
  `T: VmSafe`. Previously they allowed any `T` which is relatively
  dangerous to store any possible Rust type in Cranelift-accessible
  areas.

These three fundamental changes were introduced in this commit. All
further changes were refactoring necessary to get everything working
after these changes. For example many types in `vmcontext.rs`, such as
`VMFuncRef`, have changed to using `VmPtr<T>` instead of `NonNull<T>` or
`*mut T`. This is a pretty expansive change which resulted in touching a
lot of places.

One premise of `VmPtr<T>` is that it's non-null. This was an
additional refactoring that updated a lot of places where previously
`*mut T` was used and now either `VmPtr<T>` or `NonNull<T>` is used.

In the end the intention is that `VmPtr<T>` is used whenever pointers
are store in memory that can be accessed from Cranelift. When operating
inside of the runtime `NonNull<T>` or `SendSyncPtr<T>` is preferred
instead.

As a final note, no provenance changes have actually happened yet. For
example this doesn't fix Pulley in MIRI. What it does enable, though, is
that the future commit to fix Pulley in MIRI will be much smaller with
this having already landed.
Use `NonNull` or `*mut u8` as appropriate for function signatures
instead. It shouldn't be required to use `VmPtr` during the handoff to
compiled code as we've already annotated the pointer going out.
@alexcrichton
Copy link
Member Author

Ok this is now rebased and I believe should be ready for review @dicej

@alexcrichton
Copy link
Member Author

(I can also tag someone else in if you'd prefer too)

@dicej
Copy link
Contributor

dicej commented Jan 23, 2025

(I can also tag someone else in if you'd prefer too)

I think that would be a good idea. I don't have any experience with the Pulley code so far and don't feel qualified to review something like this -- at least not yet.

@dicej
Copy link
Contributor

dicej commented Jan 23, 2025

...although browsing the code, maybe there's not much here that's Pulley-specific? In which case I could give it a shot.

@alexcrichton alexcrichton requested review from fitzgen and removed request for dicej January 23, 2025 16:43
@alexcrichton
Copy link
Member Author

Happy to go either way yeah, it's true yeah that while the main motivation here is Pulley that's mostly for a subsequent PR running miri and pulley stuff. This PR itself is mostly just refactoring vm internals towards a VmPtr<T> type and a VmSafe trait. I've tagged @fitzgen here but if you'd like to review I'm sure Nick won't mind heh

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo a couple nitpicks

crates/wasmtime/src/runtime/vm/provenance.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/runtime/vm/provenance.rs Outdated Show resolved Hide resolved
@@ -1060,13 +1060,14 @@ impl Func {
let mut store = store.as_context_mut();
let data = &store.0.store_data()[self.0];
let func_ref = data.export().func_ref;
let params_and_returns = NonNull::new(params_and_returns).unwrap_or(NonNull::from(&mut []));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking: is this sound? Do we know for a fact (e.g. based on Rust language sematnics) that the &mut [] pointer will be valid outside of this statement?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if it is sound, I assume that's only true because it's an empty array literal and thus has a 'static lifetime?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the best of my knowledge, which is limited, this is sound even if the compiler doesn't use a 'static lifetime here. My understanding is that there's no soundness issue having a pointer to something that's deallocated, for example malloc'ing some data, saving the pointer, and then freeing the data. So even if the array here isn't 'static it's safe to have a pointer just pointing to some random address on the stack.

The other reason I think this is safe is because this is creating NonNull<[T]> which not only has a pointer but also a length. For &mut [] the length is always zero, and that means that the pointer, invalid though it may be, is never accessed.

So all in all I think it's safe because the pointer is never accessed due to the zero length on the pointer, regardless of whether rustc extends this to 'static or not

@alexcrichton alexcrichton added this pull request to the merge queue Jan 23, 2025
Merged via the queue into bytecodealliance:main with commit b86b968 Jan 23, 2025
151 checks passed
@alexcrichton alexcrichton deleted the vm-safe branch January 23, 2025 21:16
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Jan 23, 2025
This commit adds a test to CI and a script locally to execute which will
run an entire wasm module under Pulley. The goal of this commit is to
add The Test for miri execution of wasm. In general miri is too slow to
run for the full test suite and even for this single test it takes a
very long time to compile the one small module here. To help with this
the module is precompiled on native for Pulley and then deserialized in
miri itself, meaning that we skip miri execution of Cranelift entirely.

The goal of this commit is to eventually expand this test to cover lots
of little and basic operations of wasm which touch VM state. For now
it's just a simple smoke test that doesn't run much but it will be
expanded over time. Making it much larger than now already turns up miri
violations so I wanted to land an initial scaffold first before
expanding later.

Getting this test to pass requires changing the `VmPtr<T>` introduced
in bytecodealliance#10043 to use a `NonZeroUsize` internally rather than `NonNull<T>`.
This is because Pulley is only compatible with exposed provenance which
means we need to actually expose the provenance of pointers.

Both Pulley and Wasmtime need to deal with exposed provenance APIs, but
such APIs are not available in Wasmtime's current MSRV of 1.82. These
APIs were instead introduced as stable in Rust 1.84. In lieu of waiting
a few months because I'm impatient I've added a small build script to
both crates to detect the rustc version and see whether provenance APIs
are available. These build script modifications will no longer be
necessary once our MSRV is 1.84+.

prtest:miri
github-merge-queue bot pushed a commit that referenced this pull request Jan 23, 2025
* pulley: Execute a wasm module under miri

This commit adds a test to CI and a script locally to execute which will
run an entire wasm module under Pulley. The goal of this commit is to
add The Test for miri execution of wasm. In general miri is too slow to
run for the full test suite and even for this single test it takes a
very long time to compile the one small module here. To help with this
the module is precompiled on native for Pulley and then deserialized in
miri itself, meaning that we skip miri execution of Cranelift entirely.

The goal of this commit is to eventually expand this test to cover lots
of little and basic operations of wasm which touch VM state. For now
it's just a simple smoke test that doesn't run much but it will be
expanded over time. Making it much larger than now already turns up miri
violations so I wanted to land an initial scaffold first before
expanding later.

Getting this test to pass requires changing the `VmPtr<T>` introduced
in #10043 to use a `NonZeroUsize` internally rather than `NonNull<T>`.
This is because Pulley is only compatible with exposed provenance which
means we need to actually expose the provenance of pointers.

Both Pulley and Wasmtime need to deal with exposed provenance APIs, but
such APIs are not available in Wasmtime's current MSRV of 1.82. These
APIs were instead introduced as stable in Rust 1.84. In lieu of waiting
a few months because I'm impatient I've added a small build script to
both crates to detect the rustc version and see whether provenance APIs
are available. These build script modifications will no longer be
necessary once our MSRV is 1.84+.

prtest:miri

* Rejigger the CI matrix

* Don't hardcode toolchain in script
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Jan 24, 2025
An additional extension of bytecodealliance#10043 to migrate components to `VmSafe` as
well.
github-merge-queue bot pushed a commit that referenced this pull request Jan 24, 2025
An additional extension of #10043 to migrate components to `VmSafe` as
well.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:ref-types Issues related to reference types and GC in Wasmtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants