Redesign interface type value representation #4198

alexcrichton · 2022-05-31T15:39:06Z

Prior to this PR a major feature of calling component exports (#4039)
was the use of the Value<T> type, which represents a value of type T
stored in wasm linear memory. This implementation had a number of
drawbacks, though:

* When returning a value it's ABI-specific whether you use T or
  Value<T> as a return value. If T is represented with one wasm
  primitive then you have to return T, otherwise the return value must
  be Value<T>. This is non-obvious and unfortunately leaks ABI details
  into the API.
* The T in Value<T> was somewhat non-obvious. For example a
  wasm-owned string was Value<String>. Using Value<&str> didn't
  work.
* Working with Value<T> was unergonomic in the sense that you had to
  first "pair" it with a &Store<U> to get a Cursor<T> before you
  could start reading the value.
* Custom structs and enums, while not implemented yet, were planned to
  be quite wonky: given a Cursor<MyStruct> you would have to import a
  CursorMyStructExt trait generated by a proc-macro (think a
  #[derive] on the definition of MyStruct) which would enable field
  accessors, returning cursors of all the fields.
* In general there was no "generic way" to load a T from memory. Other
  operations like lift/lower/store all had methods in the
  ComponentValue trait but load had no equivalent.

None of these drawbacks were deal-breakers per se. When I started
to implement imported functions, though, the Value<T> type no longer
worked. The major difference between imports and exports is that when
receiving values from wasm an export returns at most one wasm primitive
whereas an import can yield (through arguments) up to 16 wasm primitives.
This means that if an export returned a string it would always be
Value<String>, but if an import took a string as an argument there was
no way to represent this with Value<String>, since the value isn't
stored in memory: the pointer/length pair is received directly as
arguments. Overall this meant that Value<T> couldn't be used for
arguments-to-imports, so something new was required.
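The two ways a string's pointer/length pair can reach the host can be sketched as follows. This is a minimal illustration, not Wasmtime's code: `RawStr`, `from_flat_args`, and `from_return_pointer` are hypothetical names, and the layout (two little-endian 32-bit values stored back to back) is an assumption made for the example.

```rust
/// Hypothetical pointer/length pair describing a string owned by wasm.
#[derive(Debug, PartialEq)]
pub struct RawStr {
    pub ptr: u32,
    pub len: u32,
}

/// Read a little-endian u32 from `memory` at `offset`, if it is in bounds.
fn read_u32(memory: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let bytes = memory.get(offset..end)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(bytes);
    Some(u32::from_le_bytes(buf))
}

/// Import-argument case: the pair arrives directly as two wasm primitives,
/// so nothing has to be read from linear memory to obtain it.
pub fn from_flat_args(ptr: u32, len: u32) -> RawStr {
    RawStr { ptr, len }
}

/// Export-return case: wasm returns a single primitive, here assumed to be
/// a pointer to where the pair was written in linear memory.
pub fn from_return_pointer(memory: &[u8], retptr: u32) -> Option<RawStr> {
    let base = retptr as usize;
    let ptr = read_u32(memory, base)?;
    let len = read_u32(memory, base.checked_add(4)?)?;
    Some(RawStr { ptr, len })
}
```

A type that means "a value stored in memory" only describes the second function; the first has no memory location to point at, which is the gap described above.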

This PR completely removes the Value<T> and Cursor<T> types in favor
of a different implementation. The inspiration for this comes from the
fact that all primitives can be both lifted from and lowered into wasm,
and only some types can go in just one direction. For example
String can be lowered into wasm but can't be lifted from wasm. Instead
some sort of "view" into wasm needs to be created during lifting.

One of the realizations from #4039 was that we could leverage
runtime type-checking to reject static constructions that don't make
sense. For example if an embedder asserts that a wasm function returns a
Rust String we can reject that at type-checking time because it's
impossible for a wasm module to ever do that.

The new system of imports/exports in this PR now looks like:

* Type-checking takes into account an Op operation which indicates
  whether we'll be lifting or lowering the type. This means that we can
  allow the lowering operation for String but disallow the lifting
  operation. While we can't statically rule out an embedder saying that
  a component returns a String, we can now reject it at runtime and
  disallow it from being called.
* The ComponentValue trait now sports a new load function. This
  function will load an instance of Self from the byte-array
  provided. This is implemented for all types but only ever actually
  executed when the lift operation is allowed during type-checking.
* The Lift associated type is removed since it's now expected that the
  lift operation returns Self.
* The ComponentReturn trait is no longer necessary and is removed.
  Instead returns are bounded by ComponentValue. During type-checking
  it's required that the return value can be lifted, disallowing, for
  example, returning a String or &str.
* With Value gone there's no need to specify the ABI details of the
  return value, or whether it's communicated through memory or not. This
  means that return values passed through memory are transparently
  handled by Wasmtime.
* Validation is in a sense more eagerly performed now. Whenever a value
  T is loaded the entire immediate structure of T is loaded and
  validated. Note that validation still does not recurse through
  memory, so the contents of lists or strings aren't validated; only
  the pointers are checked to be in-bounds.
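The first two bullets can be sketched roughly as below. This is a simplified model under stated assumptions, not the trait as merged: the method names `typecheck` and `load`, their signatures, the `String` error type, and the helper `lift_return` are all hypothetical, and the real check happens when a typed function is created rather than on every call.

```rust
/// Direction in which a value crosses the host/wasm boundary.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Op {
    Lift,
    Lower,
}

/// Hypothetical, simplified stand-in for the `ComponentValue` trait.
pub trait ComponentValue: Sized {
    /// Runtime type-check: may this type be used in direction `op`?
    fn typecheck(op: Op) -> Result<(), String>;
    /// Load `Self` from a byte array. Implemented for every type, but only
    /// reached once `typecheck(Op::Lift)` has succeeded.
    fn load(bytes: &[u8]) -> Result<Self, String>;
}

impl ComponentValue for u32 {
    fn typecheck(_op: Op) -> Result<(), String> {
        Ok(()) // primitives can go in both directions
    }
    fn load(bytes: &[u8]) -> Result<Self, String> {
        let raw = bytes
            .get(..4)
            .ok_or_else(|| "need 4 bytes for a u32".to_string())?;
        let mut buf = [0u8; 4];
        buf.copy_from_slice(raw);
        Ok(u32::from_le_bytes(buf))
    }
}

impl ComponentValue for String {
    fn typecheck(op: Op) -> Result<(), String> {
        match op {
            Op::Lower => Ok(()),
            Op::Lift => Err("`String` cannot be lifted from wasm".to_string()),
        }
    }
    fn load(_bytes: &[u8]) -> Result<Self, String> {
        // Present only so the trait is implemented for all types; the
        // type-check above keeps this from ever being executed.
        Err("lifting `String` is rejected during type-checking".to_string())
    }
}

/// Models the host side of reading a return value: check first, then load.
pub fn lift_return<T: ComponentValue>(bytes: &[u8]) -> Result<T, String> {
    T::typecheck(Op::Lift)?;
    T::load(bytes)
}
```

With this shape, `lift_return::<u32>` succeeds on four bytes while `lift_return::<String>` fails before `load` runs, mirroring how a `String` return is rejected at runtime rather than at compile time.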

Overall this felt like a much clearer system to work with and should be
much easier to integrate with imported functions as well. The new
WasmStr and WasmList<T> types can be used in import arguments and
lifted from the immediate arguments provided rather than forcing them to
always be stored in memory.
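As a rough illustration of a "view" type that is built from immediate arguments and validated only shallowly, consider the sketch below. `StrView` and its methods are hypothetical and are not the API of WasmStr; linear memory is modeled as a plain byte slice.

```rust
/// Hypothetical view of a string that lives in wasm linear memory.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct StrView {
    ptr: usize,
    len: usize,
}

impl StrView {
    /// Eager step: built straight from the pointer/length arguments, checking
    /// only that the range is in bounds. The bytes are not inspected.
    pub fn new(memory: &[u8], ptr: u32, len: u32) -> Result<StrView, String> {
        let (ptr, len) = (ptr as usize, len as usize);
        match ptr.checked_add(len) {
            Some(end) if end <= memory.len() => Ok(StrView { ptr, len }),
            _ => Err("string pointer/length out of bounds".to_string()),
        }
    }

    /// Lazy step: the contents are validated as UTF-8 only when read.
    pub fn to_str<'a>(&self, memory: &'a [u8]) -> Result<&'a str, String> {
        let bytes = memory
            .get(self.ptr..self.ptr + self.len)
            .ok_or_else(|| "view no longer in bounds".to_string())?;
        std::str::from_utf8(bytes).map_err(|e| e.to_string())
    }
}
```

Splitting construction from reading keeps the cost of receiving an argument independent of the string's length, at the price of a fallible read later.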

github-actions · 2022-05-31T16:06:38Z

Subscribe to Label Action

cc @peterhuene

This issue or pull request has been labeled: "wasmtime:api"

Thus the following users have been cc'd because of the following labels:

peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.



fitzgen

Nice!

github-actions bot added the wasmtime:api Related to the API of the `wasmtime` crate itself label May 31, 2022

alexcrichton mentioned this pull request May 31, 2022

Tracking issue for implementing the component model #4185

Closed

42 tasks

alexcrichton requested a review from fitzgen June 1, 2022 17:55

alexcrichton force-pushed the no-component-value branch from 42df352 to 49bb3ea Compare June 1, 2022 18:07

fitzgen approved these changes Jun 1, 2022


alexcrichton merged commit d5ce51e into bytecodealliance:main Jun 1, 2022

alexcrichton deleted the no-component-value branch June 1, 2022 20:38
