Redesign interface type value representation #4198

Merged

alexcrichton

Prior to this PR a major feature of calling component exports (#4039)
was the usage of the Value<T> type. This type represents a value
stored in wasm linear memory (the type T stored there). This
implementation had a number of drawbacks though:

  • When returning a value it's ABI-specific whether you use T or
    Value<T> as a return value. If T is represented with one wasm
    primitive then you have to return T, otherwise the return value must
    be Value<T>. This is somewhat non-obvious and leaks ABI-details into
    the API which is unfortunate.

  • The T in Value<T> was somewhat non-obvious. For example a
    wasm-owned string was Value<String>. Using Value<&str> didn't
    work.

  • Working with Value<T> was unergonomic in the sense that you had to
    first "pair" it with a &Store<U> to get a Cursor<T> and then you
    could start reading the value.

  • Custom structs and enums, while not implemented yet, were planned to
    be quite wonky: given a Cursor<MyStruct> you would have to import a
    CursorMyStructExt trait generated by a proc-macro (think a
    #[derive] on the definition of MyStruct), which would enable field
    accessors returning cursors of all the fields.

  • In general there was no "generic way" to load a T from memory. Other
    operations like lift/lower/store all had methods in the
    ComponentValue trait but load had no equivalent.

None of these drawbacks were deal-breakers per se. When I started
to implement imported functions, though, the Value<T> type no longer
worked. The major difference between imports and exports is that when
receiving values from wasm an export returns at most one wasm primitive
where an import can yield (through arguments) up to 16 wasm primitives.
This means that if an export returned a string it would always be
Value<String> but if an import took a string as an argument there was
actually no way to represent this with Value<String> since the value
wasn't actually stored in memory but rather the pointer/length pair is
received as arguments. Overall this meant that Value<T> couldn't be
used for arguments-to-imports, which means that altogether something new
would be required.

This PR completely removes the Value<T> and Cursor<T> types in favor
of a different implementation. The inspiration for this comes from the
fact that all primitives can be both lifted from and lowered into wasm,
while it's just some types which can only go in one direction. For example
String can be lowered into wasm but can't be lifted from wasm. Instead
some sort of "view" into wasm needs to be created during lifting.

One of the realizations from #4039 was that we could leverage
run-time-type-checking to reject static constructions that don't make
sense. For example if an embedder asserts that a wasm function returns a
Rust String we can reject that at typechecking time because it's
impossible for a wasm module to ever do that.

The new system of imports/exports in this PR now looks like:

  • Type-checking takes into account an Op operation which indicates
    whether we'll be lifting or lowering the type. This means that we can
    allow the lowering operation for String but disallow the lifting
    operation. While we can't statically rule out an embedder saying that
    a component returns a String we can now reject it at runtime and
    disallow it from being called.

  • The ComponentValue trait now sports a new load function. This
    function will load an instance of Self from the byte-array
    provided. This is implemented for all types but only ever actually
    executed when the lift operation is allowed during type-checking.

  • The Lift associated type is removed since it's now expected that the
    lift operation returns Self.

  • The ComponentReturn trait is now no longer necessary and is removed.
    Instead returns are bounded by ComponentValue. During type-checking
    it's required that the return value can be lifted, disallowing, for
    example, returning a String or &str.

  • With Value gone there's no need to specify the ABI details of the
    return value, or whether it's communicated through memory or not. This
    means that handling return values through memory is transparently
    handled by Wasmtime.

  • Validation is in a sense more eagerly performed now. Whenever a value
    T is loaded the entire immediate structure of T is loaded and
    validated. Note that recursive validation through memory still does
    not happen, so the contents of lists or strings aren't validated; it's
    only validated that the pointers are in-bounds.
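
To make the direction-aware type-checking above concrete, here is a minimal sketch of the idea. It is a toy model, not Wasmtime's actual code: the names ToyComponentValue, InterfaceType and lift_from_memory, the error type, and the method signatures are all assumptions made for illustration. Only the shape follows the description: typecheck receives an Op, load is implemented for every type, and load only runs once the lift check has passed.

```rust
/// Direction in which a host type is about to be used.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Op {
    /// A host value is passed into wasm.
    Lower,
    /// A wasm value is read out into a host value.
    Lift,
}

/// Hypothetical, heavily simplified stand-in for an interface type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum InterfaceType {
    U32,
    String,
}

/// Toy analogue of the trait described above (assumed shape, not the real one).
pub trait ToyComponentValue: Sized {
    /// Runtime check that `Self` may be used for `ty` in direction `op`.
    fn typecheck(ty: InterfaceType, op: Op) -> Result<(), String>;
    /// Load an instance of `Self` from a byte array.
    fn load(bytes: &[u8]) -> Result<Self, String>;
}

impl ToyComponentValue for u32 {
    fn typecheck(ty: InterfaceType, _op: Op) -> Result<(), String> {
        // Primitives can go in both directions, so `op` is ignored.
        match ty {
            InterfaceType::U32 => Ok(()),
            other => Err(format!("expected u32, found {:?}", other)),
        }
    }

    fn load(bytes: &[u8]) -> Result<Self, String> {
        if bytes.len() != 4 {
            return Err("expected exactly 4 bytes".to_string());
        }
        Ok(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
    }
}

impl ToyComponentValue for String {
    fn typecheck(ty: InterfaceType, op: Op) -> Result<(), String> {
        match (ty, op) {
            // A host-owned String can be copied into wasm...
            (InterfaceType::String, Op::Lower) => Ok(()),
            // ...but wasm can never hand back a host-owned String.
            (InterfaceType::String, Op::Lift) => {
                Err("String cannot be lifted from wasm".to_string())
            }
            (other, _) => Err(format!("expected string, found {:?}", other)),
        }
    }

    fn load(_bytes: &[u8]) -> Result<Self, String> {
        // Present for every type, but never executed: the lift check fails first.
        unreachable!("typecheck rejects lifting String")
    }
}

/// Type-check for lifting first, and only then load.
pub fn lift_from_memory<T: ToyComponentValue>(
    ty: InterfaceType,
    bytes: &[u8],
) -> Result<T, String> {
    T::typecheck(ty, Op::Lift)?;
    T::load(bytes)
}
```

In this sketch, lift_from_memory::<String> returns an error before load is reached, which mirrors how an embedder's claim that a component returns a String can be rejected at runtime rather than statically.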

Overall this felt like a much clearer system to work with and should be
much easier to integrate with imported functions as well. The new
WasmStr and WasmList<T> types can be used in import arguments and
lifted from the immediate arguments provided rather than forcing them to
always be stored in memory.
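
The difference between lifting from immediate arguments and loading from memory can be sketched with a toy string view. This is an illustration under stated assumptions, not the real WasmStr: ToyWasmStr, its methods, and modelling linear memory as a plain byte slice are all hypothetical, and the in-memory layout (two little-endian u32 values, pointer then length) is assumed for the example. Construction checks only that the pointer/length pair is in-bounds, and the contents are validated only when read.

```rust
/// Toy view of a wasm-owned string: a bounds-checked (ptr, len) pair.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ToyWasmStr {
    ptr: usize,
    len: usize,
}

impl ToyWasmStr {
    /// Shared constructor: validates only that the range is in-bounds.
    fn new(memory: &[u8], ptr: u32, len: u32) -> Result<Self, String> {
        let (ptr, len) = (ptr as usize, len as usize);
        match ptr.checked_add(len) {
            Some(end) if end <= memory.len() => Ok(ToyWasmStr { ptr, len }),
            _ => Err("string pointer/length out of bounds".to_string()),
        }
    }

    /// Import-argument case: the pair arrives as two immediate wasm values.
    pub fn lift(memory: &[u8], args: [u32; 2]) -> Result<Self, String> {
        Self::new(memory, args[0], args[1])
    }

    /// Through-memory case: the pair itself is stored at `offset`
    /// (assumed layout: little-endian u32 pointer followed by u32 length).
    pub fn load(memory: &[u8], offset: usize) -> Result<Self, String> {
        let raw = offset
            .checked_add(8)
            .and_then(|end| memory.get(offset..end))
            .ok_or_else(|| "pointer/length pair out of bounds".to_string())?;
        let ptr = u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]);
        let len = u32::from_le_bytes([raw[4], raw[5], raw[6], raw[7]]);
        Self::new(memory, ptr, len)
    }

    /// The contents are validated as UTF-8 only here, when actually read.
    pub fn to_str<'a>(&self, memory: &'a [u8]) -> Result<&'a str, String> {
        let bytes = memory
            .get(self.ptr..self.ptr + self.len)
            .ok_or_else(|| "view does not fit this memory".to_string())?;
        std::str::from_utf8(bytes).map_err(|e| e.to_string())
    }
}
```

Both constructors produce the same view type, so in this sketch the caller does not need to know whether the value was communicated through arguments or through memory.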

@fitzgen fitzgen left a comment

Nice!

@alexcrichton alexcrichton merged commit d5ce51e into bytecodealliance:main Jun 1, 2022
@alexcrichton alexcrichton deleted the no-component-value branch June 1, 2022 20:38