Reimplement host generators in terms of components #314

Closed
alexcrichton opened this issue Sep 14, 2022 · 8 comments

Comments

@alexcrichton
Member

I'm opening this issue to document my thinking as a result of trying to resolve issues like #214, #34, and #201. I mentioned a number of issues with the current wit-bindgen architecture in #214, the biggest of which is that there was no clear location to slot a true "component" into the current hosts due to their centralized concept of using a singular wasm module for everything. To that end the best summary I can come up with is that the host-generators of wit-bindgen (gen-host-wasmtime-{py,rust} and gen-host-js) need to be reimplemented in terms of components, not core wasm modules.

In pivoting to components instead of core wasm modules I also believe that there are going to be two classifications of hosts: one where the host natively supports components and one where the host does not. For example gen-host-wasmtime-rust natively supports components, but gen-host-wasmtime-py (which uses Wasmtime's C API) and gen-host-js do not. This is not just a surface-level distinction but will have deep implications on the capability of the code generators, namely hosts which do not have native component support will only support loading a single statically determined component. Hosts which natively support components (I'll just say Wasmtime from now on to refer to the Rust embedding of Wasmtime) will be able to consume any component that has a particular "shape", e.g. the same set of imports/exports with arbitrary structure internally.

Hosts without component support

As mentioned above, the major change for these hosts is that, unlike today where they attempt to support any "shape" of core wasm module, these hosts will only be able to generate bindings for a single statically known component. The rationale for this is that the imports/exports of a component, which are the most interesting from an embedder perspective, are only a small portion of the structure of a component, which can have a lot going on internally. I don't think that wit-bindgen wants to get into the business of generating an entire component runtime for JS, for example, so instead these host generators will ingest a component and spit out bindings for just that one component.

At least for JS this lines up with how it was envisioned being used. In JS or web scenarios the intent is that there's a "thing compiled to wasm" that you want to run, so it's a statically known single module and there's no need for it to be able to instantiate multiple different shapes of modules at runtime. The Python generator for a Wasmtime host is mostly just a proof-of-concept right now and I personally don't have preconceived notions about how Python is most likely to be used.

The refactoring for the JS host generator (and transitively Python) will be that instead of taking --import and --export for *.wit files the JS host generator would instead "simply" take a component as input. At a high level I view the bindings generation process as:

  • A *.wasm file, which is a component, is provided as input
  • The wit-bindgen-gen-host-js crate uses the wasmtime-environ crate, an internal implementation of Wasmtime itself, to "decompile" the component
  • The output is a Component structure from wasmtime-environ. The JS host generator would then have the goal of translating that to a JS object with bindings.
  • Using Component::imports the imports argument to the JS object are generated (probably mirroring core wasm imports as nested maps or something like that)
  • Next Component::initializers is iterated over, which goes through the process of instantiating a component. For example ExtractRealloc would do something like this.reallocN = previous_wasm_instance.exports['the-realloc-name'];
  • The LowerImport variant is where the lowering logic lives: a JS function is generated to pass to the import of a component. This JS function will do what wit-bindgen-gen-host-js does today, taking wasm primitives as arguments and then calling the appropriate JS-imported function, translating the results back to wasm.
  • Finally the Component::exports array is iterated over to create the exported functions on the JS object. These similarly do what wit-bindgen-gen-host-js does today.
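As a rough illustration of the initializer walk described above, here is a minimal Rust sketch of a generator that turns a list of steps into JS statements. The `Initializer` enum is a hypothetical, heavily simplified stand-in (it is not the real wasmtime-environ type), and the emitted names such as `instanceN`, `coreImportsN` and `loweredN` are assumptions made up for this example.

```rust
/// Hypothetical, simplified stand-ins for the initializer steps described
/// above. These are NOT the real `wasmtime-environ` types.
pub enum Initializer {
    /// Instantiate core module `module` and bind the result to `instanceN`.
    InstantiateModule { module: usize, instance: usize },
    /// Save the realloc export called `name` from core instance `instance`.
    ExtractRealloc { index: usize, instance: usize, name: String },
    /// Create a JS trampoline that forwards to the host import `import`.
    LowerImport { index: usize, import: String },
}

/// Emit the body of a JS `instantiate` function, one statement per step.
pub fn emit_instantiate_body(initializers: &[Initializer]) -> String {
    let mut js = String::new();
    for init in initializers {
        let line = match init {
            Initializer::InstantiateModule { module, instance } => format!(
                "const instance{} = await WebAssembly.instantiate(module{}, coreImports{});",
                instance, module, instance
            ),
            Initializer::ExtractRealloc { index, instance, name } => format!(
                "this.realloc{} = instance{}.exports['{}'];",
                index, instance, name
            ),
            Initializer::LowerImport { index, import } => format!(
                "const lowered{} = (...args) => imports['{}'](...args); // lifting/lowering elided",
                index, import
            ),
        };
        js.push_str(&line);
        js.push('\n');
    }
    js
}
```

The point of the sketch is only that each step maps to a small, statically known piece of JS, so no component runtime has to be shipped alongside the bindings.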

Overall the contents of wit-bindgen-gen-host-js will largely be kept but there will be a number of internal refactorings to ensure that all the right wires can be connected to the right place. For example loads/stores will grow a parameter for which memory they're referencing, and things like that. By taking an actual component as input this will also provide a path forward to implementing features like non-utf-8 strings, memory64 modules, etc. These will probably not be implemented for now but could be implemented in the future as the need arises.
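To make the "loads/stores grow a memory parameter" point concrete, here is a small sketch of what such generator helpers might look like. The function names and the `memoryN` naming scheme in the emitted JS are assumptions for illustration only, not the actual internals of wit-bindgen-gen-host-js.

```rust
/// Emit a JS expression that loads a little-endian `i32` from linear memory
/// number `memory`, at the address given by the JS expression `ptr`.
/// The `memoryN` variable naming is a made-up convention for this sketch.
pub fn emit_load_i32(memory: usize, ptr: &str) -> String {
    format!("new DataView(memory{}.buffer).getInt32({}, true)", memory, ptr)
}

/// Emit a JS statement that stores the JS expression `value` as a
/// little-endian `i32` into linear memory number `memory` at `ptr`.
pub fn emit_store_i32(memory: usize, ptr: &str, value: &str) -> String {
    format!(
        "new DataView(memory{}.buffer).setInt32({}, {}, true);",
        memory, ptr, value
    )
}
```

With a single core module the memory index would always be the same, which is presumably why today's generator does not need it; with a component, different lowerings may reference different memories.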

An example is that for this component:

(component
  (import "host-log" (func (result string)))
  ;; ..

  (export "guest-log" (func ...))
)

would generate JS of the form:

class TheComponent { // configurable name from the CLI probably based on the name of the `*.wasm` or something
    static async instantiate(
        loadWasm: (name: string) => Promise<WebAssembly.Module>,
        imports: TheComponentImports,
    ): Promise<TheComponent> {
        // lowerings, initializers, etc
        return new TheComponent();
    }

    guest_log(arg: string) {
        // uses `this` to lower `arg` and lift results
    }
}

interface TheComponentImports {
    host_log: () => string;
}

(please excuse my JS pseudo-code)

Here the loadWasm will be used to asynchronously load the core wasm blobs that wit-bindgen would spit out (in addition to the JS file here). The imports is the imports object generated from the input component. Otherwise everything is entirely internal within the component translation and is part of the bindings.

Open questions are:

  • The binary format currently has no affordances for the names of types, so it's not clear how human-readable names for type bindings will be generated
  • Components-calling-components via adapters should "work" except for the fact that this requires multi-memory-in-core-wasm and I'm not sure if any JS runtimes implement that yet.
  • Documentation as currently lives in *.wit files would not work any more since there's no location in the binary format for that to live today

Hosts with component support (Wasmtime)

The story above for JS and Python-on-the-host will be radically different for Wasmtime-on-the-host since Wasmtime has native support for components. All of the nitty-gritty of lifting and lowering is handled by the wasmtime crate and derived trait implementations on types. This means that wit-bindgen-gen-host-wasmtime-rust actually does "just" a fairly small amount of work and can be more general with the input it ingests than the JS bindings.

The inputs to the Wasmtime generator today are --import and --export files but I believe this should be removed in favor of *.world files. An input *.world file would then have the Wasmtime generator generate associated submodules/traits for all the necessary components. This would be roughly the same shape of today's code generator but the differences I think will be:

  • There should be one types module which is a "soup" of all types mentioned everywhere in the *.world file. This would be how Wasmtime would translate from the structural typing of the component model to the nominal typing of Rust.
  • A Rust module would be generated for each interface in the *.world files, and the types used in the interface would be use'd, possibly renamed, from the types module generated prior.
  • Imported interfaces would turn into a trait definition.
  • Exported interfaces would probably all be union'd onto one output generated structure. (details TBD)
  • One add_to_linker function would be generated which would project from the T of the Store into &mut U where U: ImportedTrait for all the imported interfaces. This would then register all the appropriate names in wasmtime::component::Linker with the appropriate types. Note that no lifting/lowering happens here, that's all handled by wasmtime.
  • The exported structure would have a method like new_from_instance or similar (same as what's there today) which would then extract all the exports, type-check them, and store TypedFunc references to all the functions.
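The trait-plus-projection shape described in the list above can be sketched as follows. This is only a toy model under stated assumptions: `ToyLinker` is a made-up stand-in that records closures by name and does not mirror the API of `wasmtime::component::Linker`, and the `HostLog` trait and the `"host-log"` import name are borrowed from the earlier example purely for illustration.

```rust
use std::collections::HashMap;

/// Trait that might be generated for an imported interface (hypothetical).
pub trait HostLog {
    fn host_log(&mut self) -> String;
}

/// Toy stand-in for a linker. The real `wasmtime::component::Linker` has a
/// different API; this one only stores closures keyed by import name.
pub struct ToyLinker<T> {
    funcs: HashMap<String, Box<dyn Fn(&mut T) -> String>>,
}

impl<T> ToyLinker<T> {
    pub fn new() -> Self {
        ToyLinker { funcs: HashMap::new() }
    }

    /// Call a registered import with the store's data, if it exists.
    pub fn call(&self, name: &str, state: &mut T) -> Option<String> {
        self.funcs.get(name).map(|f| f(state))
    }
}

/// Register every import of the interface. `project` goes from the store
/// data `T` to the `U` that implements the generated trait.
pub fn add_to_linker<T, U>(linker: &mut ToyLinker<T>, project: fn(&mut T) -> &mut U)
where
    T: 'static,
    U: HostLog + 'static,
{
    linker.funcs.insert(
        "host-log".to_string(),
        Box::new(move |state: &mut T| project(state).host_log()),
    );
}
```

An embedder would implement `HostLog` on some part of its store data and pass a projection function such as `fn project(s: &mut MyState) -> &mut MyHost`; the generated code itself never does any lifting or lowering.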

I think the general structure of Wasmtime's generator won't change much relative to the changes needed for the JS generator. Largely code is just going to get removed and the input to the generator will change to be a *.world instead of a list of exports/imports. Overall personally I feel like more forcing functions are needed to guide the precise design of the generated code here. Most of this is just me shooting in the dark trying to figure out something that's reasonable, but having more concrete use cases where the above doesn't work would help guide tweaks and refinements to improve the generated interfaces.

Should this all still live in one repository?

I think this is a reasonable question to ask with the above changes. For example the Wasmtime code generator is using *.wit parsing but the JS generator isn't. The JS generator is dealing with lifting/lowering and the Wasmtime code generator isn't. Similarly guests are also pretty different where they're doing lifting/lowering but in the context of their own wasm module as opposed to a guest being instantiated.

In my opinion, though, there's enough shared that I'm still of the opinion that this should all live in the same place. The type hierarchy representation is shared amongst all these use cases for one. Additionally the lifting/lowering details are shared between the guests and JS generator. The *.wit parsing is shared between guests and Wasmtime. While it's not quite "everything shares everything" as-is today I personally feel there's enough overlap for this all to live in the same repository to develop within.

What next?

First and primarily these changes I think need to be agreed upon. These are massive changes for any non-Wasmtime generator, namely taking components as input rather than *.wit files. Even for Wasmtime things are going to change a lot because the core wasm abstraction layer will be going away and instead Wasmtime's component model support will be used. All that's to say that this requires a lot of deep architectural changes for both users of wit-bindgen and wit-bindgen itself, so agreement should be established first.

Even with agreement on a path forward I don't think there's a great story on how to realize all the changes I describe above. The best idea I have personally is to:

  • Implement an independent tool that goes from core wasm modules to components using the canonical ABI name mangling.
  • Use "one giant PR" to atomically move over everything in wit-bindgen to the new architecture.

That "one giant PR" isn't really parallelizable at all and there can't really be any meaningful independent development while that PR is being written unfortunately.

@willemneal
Contributor

Looks great! I have a quick question, you mention decompiling the component using wasmtime-environ, could this approach allow for statically linking components?

@alexcrichton
Member Author

Perhaps? I don't think I know what you mean by statically linking components though. My current understanding/intention is that the input to the JS runtime would be a single component which internally might have other components within it but that single component wouldn't be able to import other components (similar to the current restrictions of the Wasmtime-based embedding). In that sense you could statically link components together by bundling them into one large component, but I'm not sure if this is what you are asking for.

@willemneal
Contributor

willemneal commented Sep 14, 2022

In that sense you could statically link components together by bundling them into one large component, but I'm not sure if this is what you are asking for.

Yeah I'm wondering about where this tooling fits in.

@alexcrichton
Member Author

The tooling for actually creating a statically linked component is somewhat orthogonal to wit-bindgen itself and host generators, they'll just need to work with whatever is given. I believe the wasm-tools compose subcommand, the wasm-compose crate, in the wasm-tools repository is the initial work towards creating a tool such as this, though (written by @peterhuene)

@alexcrichton
Member Author

I have discovered what is at least a wrinkle and at most a showstopper for implementing this: WASI. The current wasi_snapshot_preview1 imports are not specified with interface types and are not compatible with interface types either. All existing targets that compile to wasm which wit-bindgen works with, however, use WASI targets. For example Rust today uses WASI, C uses WASI, and hypothetical JS, Python, Ruby, and Go targets all are expected to use WASI as well.

I decided to start on this today by doing the bare minimum: producing a component as part of the build process, just to make sure it can be done for the tests in this repository. This cannot succeed, however, due to WASI imports. The only recourse at this time is to use a non-WASI target like wasm32-unknown-unknown. That has significant drawbacks, however:

  • Only Rust works with wasm32-unknown-unknown. While C theoretically works I am unaware of any standard toolchain which actually has support for this.
  • All other targets (JS, Python, Ruby, Go, ...) seem highly unlikely to work with "you can't import anything".
  • Even in Rust the support is extremely bare-bones, if an assert! trips or similar there's no way to get a message to the user since stdio, for example, doesn't work.

I'm currently debating with myself whether it's worth it to drop support for C, compile Rust with wasm32-unknown-unknown, and just eat the "this is almost impossible to debug" cost. On one hand it is the only way to make progress at this time. On the other hand it is clearly a subpar experience, by a significant amount. The best alternative that I can think of is to write, by hand, a multi-memory-using module which adapts wasi_snapshot_preview1 to some custom wit_bindgen_tests_system_interface or something like that which can be specified with the component model. This would, for example, adapt fd_write on fd 1 to some print(x: string) function imported from the host. Such a core wasm module cannot be written in Rust, though, due to the use of multi-memory, so I don't know how to maintain that (and again it has no viability outside this repository).
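To illustrate what "adapt fd_write on fd 1 to print(x: string)" means semantically, here is a host-side model of the translation logic in Rust. It is not an adapter module (which, as noted, could not be written this way); linear memory is modeled as a byte slice, and the `Host` trait is a hypothetical stand-in for the component-model import. The iovec layout (a 32-bit pointer followed by a 32-bit length) and the errno values follow wasi_snapshot_preview1.

```rust
/// Hypothetical host import that would be specified with the component model.
pub trait Host {
    fn print(&mut self, x: &str);
}

const ERRNO_SUCCESS: u32 = 0; // preview1 `errno::success`
const ERRNO_BADF: u32 = 8; // preview1 `errno::badf`

/// Read a little-endian `u32` out of the modeled linear memory.
pub fn read_u32(memory: &[u8], addr: usize) -> u32 {
    u32::from_le_bytes([
        memory[addr],
        memory[addr + 1],
        memory[addr + 2],
        memory[addr + 3],
    ])
}

/// Model of a preview1-style `fd_write` that only supports stdout (fd 1)
/// and forwards each buffer to `Host::print`.
pub fn fd_write<H: Host>(
    memory: &mut [u8],
    host: &mut H,
    fd: u32,
    iovs: usize,
    iovs_len: usize,
    nwritten: usize,
) -> u32 {
    if fd != 1 {
        return ERRNO_BADF;
    }
    let mut total = 0u32;
    for i in 0..iovs_len {
        // Each iovec entry is 8 bytes: buffer pointer, then buffer length.
        let entry = iovs + i * 8;
        let buf = read_u32(memory, entry) as usize;
        let len = read_u32(memory, entry + 4) as usize;
        host.print(&String::from_utf8_lossy(&memory[buf..buf + len]));
        total += len as u32;
    }
    // Report how many bytes were "written" back through the out-pointer.
    memory[nwritten..nwritten + 4].copy_from_slice(&total.to_le_bytes());
    ERRNO_SUCCESS
}
```

A real adapter would additionally have to deal with the guest's memory being separate from its own, which is where the multi-memory requirement comes from.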

@alexcrichton
Member Author

The issue I raised about preview1 was discussed at today's wit-bindgen meeting and the conclusion was that we'll write a source-level translation which exports preview1-lookalike things and imports, via wit-bindgen generated stubs, "preview2" things. Currently "preview2" doesn't exist yet in a formalized state that wit-bindgen can import so it would be some adaptation.

The nuances would then be:

  • This shim module would perform translation from preview1 to preview2. It would need to be instrumented at componentization time in the following ways:
    • The linear memory would be imported, not exported
    • The stack for this module would be allocated at start with a memory.grow
    • This module would import its function table, and the import would get removed.
    • This module cannot use any data segments
    • This module cannot use any elem segments
    • This module would export cabi_realloc which would return a per-function-call return pointer. This ideally fits the use case where each preview1 call requires at most one return value that needs a malloc
  • The shim module would be gc'd to be the precise size necessary for the preview1 imports being required
  • Support for inserting this shim module would get added to the wit-component tool (eventually wasm-componentize as it develops)
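The "per-function-call return pointer" idea for cabi_realloc can be modeled roughly as below. This is a sketch of the bookkeeping only, under the assumption that at most one allocation happens per call; the `Shim` type and its method names are hypothetical, and `None` stands in for what would be a trap in a real module.

```rust
/// Toy model of the shim's allocation state (hypothetical names).
pub struct Shim {
    /// Pointer to hand out for the single allocation of the current call.
    next_return_ptr: Option<u32>,
}

impl Shim {
    pub fn new() -> Self {
        Shim { next_return_ptr: None }
    }

    /// Called at the start of a preview1 function: remember where the one
    /// canonical-ABI allocation for this call should be placed.
    pub fn begin_call(&mut self, ptr: u32) {
        self.next_return_ptr = Some(ptr);
    }

    /// Model of `cabi_realloc(old_ptr, old_size, align, new_size)`.
    /// Only fresh allocations are supported, and only one per call.
    pub fn cabi_realloc(
        &mut self,
        old_ptr: u32,
        old_size: u32,
        _align: u32,
        _new_size: u32,
    ) -> Option<u32> {
        if old_ptr != 0 || old_size != 0 {
            return None; // growing an existing allocation is not modeled
        }
        self.next_return_ptr.take()
    }
}
```

Because the pointer is consumed on first use, a second allocation within the same call is detected rather than silently reusing the same buffer.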

This should provide, as a general purpose shim, a way to migrate from wasi-preview1 to wasi-preview2 in the long term, ideally. For now it should provide a reasonable means by which the wit-bindgen tests can be written and run.

alexcrichton added a commit to alexcrichton/witx-bindgen that referenced this issue Sep 30, 2022
This commit is an addition to the `wit-component` tool to be able to
polyfill WASI imports today using `wasi_snapshot_preview1` with a
component-model-using interface in the future. This is a large extension
to the functionality of `wit-component` internally since the generated
component is much "fancier".

The support in this commit is modeled as the addition of "adapter
modules" into the `wit-component` tool. An adapter module is understood
to translate from some core-wasm ABI into a component-model using ABI.
The intention is that for any previous API prior to the component model
an adapter module could be written which would translate from the prior
API to the new API. For example in WASI today there is:

    (@interface func (export "random_get")
      (param $buf (@witx pointer u8))
      (param $buf_len $size)
      (result $error (expected (error $errno)))
    )

whereas a component-model-using API would look more like:

    random-get: func(size: u32) -> list<u8>

This component-model version can be adapted with a module such as:

    (module $wasi_snapshot_preview1
      (import "new-wasi" "random_get" (func $new_random_get (param i32 i32)))
      (import "env" "memory" (memory 0))

      (global $last_ptr (mut i32) i32.const 0)

      (func (export "random_get") (param i32 i32) (result i32)
        ;; store buffer pointer in a saved global for `cabi_realloc`
        ;; later
        (global.set $last_ptr (local.get 0))

        ;; 1st argument: the `size: u32`
        local.get 1

        ;; 2nd argument: return pointer for `list<u8>`
        i32.const 8

        call $new_random_get

        ;; return a "success" return code
        i32.const 0
      )

      ;; When the canonical ABI allocates space for the list return value
      ;; return the original buffer pointer to place it directly in the
      ;; target buffer
      (func (export "cabi_realloc") (param i32 i32 i32 i32) (result i32)
        global.get $last_ptr)
    )

Using this adapter module the internal structure of the generated
component can be done such that everything is wired up in all the right
places meaning that when the original module calls
`wasi_snapshot_preview1::random_get` it actually calls this shim module
which then calls the actual `new-wasi::random_get` import. There's a few
details I'm glossing over here like the stack used by the shim module
but this suffices to describe the general shape.

My plan in the future is to use this support to generate a component
from all test cases that this repository supports. That means that,
specifically for `wit-bindgen` tests, a fresh new interface representing
"future WASI" will be created and the WASI functions used by tests will
be adapted via this adapter module. In this manner components will now
be generated for all tests and then the next step is bytecodealliance#314, actually
ingesting these components into hosts.
alexcrichton added a commit to alexcrichton/witx-bindgen that referenced this issue Oct 3, 2022
alexcrichton added a commit to alexcrichton/witx-bindgen that referenced this issue Oct 4, 2022
alexcrichton added a commit to alexcrichton/witx-bindgen that referenced this issue Oct 4, 2022
This commit removes all support for the `resource` and `Handle` types
from the AST of `wit-parser` and all related support in all code
generators. The motivation for this commit is that `wit-bindgen` is on
the cusp of actually being able to work with components: producing a
component from guest output and consuming components in host generators.
More detail about this is in bytecodealliance#314. With components as an intermediate
format, however, there is no way to encode resources since they are not
part of the component model proposal yet.

All is not lost for handles and resources, though. The official design
for handles and resources is being worked on upstream in the component
model repository itself at this time and once added all of this support
will be re-added to `wit-bindgen`. In the meantime though I personally
think that the best way forward is to remove the interim support for a
few reasons:

* Primarily it unblocks progress at this time towards fully integrating
  components and the `wit-bindgen` generators. The requirement to run
  existing tests that use handles would mean that no host generator
  could actually switch to components and/or modes for today's
  core-wasm-lookalike would need to be preserved.

* Otherwise though the semantics of the current handles are basically
  invented out of thin air by myself and were never really formally
  specified, debated, or designed deliberately. I grafted `witx`-style
  handles into `wit-component` and added features as necessary over
  time, but it seems highly unlikely that the handles designed as part
  of the component model will be the ones that `wit-bindgen` currently
  supports. This inevitably means that a new system would need new code
  anyway and would likely result in removal regardless.

As usual git always has the history of handles and this all may come
back in one shape or another if only slightly tweaked. I'm confident in
our history spelunking abilities, though, so I don't feel that keeping
support in the repository is necessary for this purpose.
alexcrichton added a commit that referenced this issue Oct 4, 2022
* Add support to `wit-component` to polyfill WASI

This commit is an addition to the `wit-component` tool to be able to
polyfill WASI imports today using `wasi_snapshot_preview1` with a
component-model-using interface in the future. This is a large extension
to the functionality of `wit-component` internally since the generated
component is much "fancier".

The support in this commit is modeled as the addition of "adapter
modules" into the `wit-component` tool. An adapter module is understood
to translate from some core-wasm ABI into a component-model using ABI.
The intention is that for any previous API prior to the component model
an adapter module could be written which would translate from the prior
API to the new API. For example in WASI today there is:

    (@interface func (export "random_get")
      (param $buf (@WitX pointer u8))
      (param $buf_len $size)
      (result $error (expected (error $errno)))
    )

whereas a component-model-using API would look more like:

    random-get: func(size: u32) -> list<u8>

This component-model version can be adapted with a module such as:

    (module $wasi_snapshot_preview1
      (import "new-wasi" "random_get" (func $new_random_get (param i32 i32)))
      (import "env" "memory" (memory 0))

      (global $last_ptr (mut i32) i32.const 0)

      (func (export "random_get") (param i32 i32) (result i32)
        ;; store buffer pointer in a saved global for `cabi_realloc`
        ;; later
        (global.set $last_ptr (local.get 0))

        ;; 1st argument: the `size: u32`
        local.get 1

        ;; 2nd argument: return pointer for `list<u8>`
        i32.const 8

        call $new_random_get

        ;; return a "success" return code
        i32.const 0
      )

      ;; When the canonical ABI allocates space for the list return value
      ;; return the original buffer pointer to place it directly in the
      ;; target buffer
      (func (export "cabi_realloc") (param i32 i32 i32 i32) (result i32)
        global.get $last_ptr)
    )

With this adapter module, the generated component can be structured
internally so that everything is wired up in the right places: when the
original module calls `wasi_snapshot_preview1::random_get` it actually
calls this shim module, which then calls the actual
`new-wasi::random_get` import. There are a few details I'm glossing over
here, like the stack used by the shim module, but this suffices to
describe the general shape.
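
The call flow described above can be simulated outside of WebAssembly. The following is a minimal Python sketch, not the actual `wit-component` output: `Memory`, `Adapter`, `make_host_random_get`, and `wire` are hypothetical names, and the host-side lowering of `list<u8>` is a simplified assumption (call `cabi_realloc`, copy the bytes, then store the pointer and length at the return pointer).

```python
import struct

RETPTR = 8  # scratch address for the (ptr, len) pair, mirroring `i32.const 8`


class Memory:
    """Toy stand-in for the guest's linear memory (hypothetical)."""

    def __init__(self, size=256):
        self.data = bytearray(size)


class Adapter:
    """Plays the role of the `wasi_snapshot_preview1` shim module."""

    def __init__(self, memory):
        self.memory = memory
        self.last_ptr = 0           # the `$last_ptr` global
        self.new_random_get = None  # filled in by `wire`

    def cabi_realloc(self, old_ptr, old_size, align, new_size):
        # Hand back the caller's buffer so the list lands directly in it.
        return self.last_ptr

    def random_get(self, buf, buf_len):
        self.last_ptr = buf                   # remember the target buffer
        self.new_random_get(buf_len, RETPTR)  # (size, return pointer)
        return 0                              # "success" errno


def make_host_random_get(memory, realloc, source):
    """Assumed host-side lowering of `random-get: func(size: u32) -> list<u8>`."""

    def new_random_get(size, retptr):
        payload = source(size)
        ptr = realloc(0, 0, 1, len(payload))
        memory.data[ptr:ptr + len(payload)] = payload
        # Store the list as little-endian (pointer, length) at the return pointer.
        struct.pack_into("<II", memory.data, retptr, ptr, len(payload))

    return new_random_get


def wire(memory, source):
    adapter = Adapter(memory)
    adapter.new_random_get = make_host_random_get(
        memory, adapter.cabi_realloc, source
    )
    return adapter


# A deterministic "random" source keeps the example reproducible.
memory = Memory()
adapter = wire(memory, lambda n: bytes(range(1, n + 1)))
errno = adapter.random_get(32, 4)  # the guest asks for 4 bytes at address 32
```

After the call, the four bytes sit at address 32 in the toy memory, which is the buffer the original caller passed in, and no copy through a separate allocation was needed.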

My plan in the future is to use this support to generate a component
from all test cases that this repository supports. That means that,
specifically for `wit-bindgen` tests, a fresh new interface representing
"future WASI" will be created and the WASI functions used by tests will
be adapted via this adapter module. In this manner components will now
be generated for all tests and then the next step is #314, actually
ingesting these components into hosts.

* Update the gc translation macro

* Add gc comments

* Fix a missing `end` instruction on sp initializer

* Preserve the names of globals in adapter modules

Should help with debugging structure ideally

* Small refactor

* Hack in more stack pointer detection

This unfortunately suffers greatly from false negatives, but at this
time it's unclear if this can be done better.

* Improve readability of validation condition

* Improve validation documentation

* Comment a test case
alexcrichton added a commit to alexcrichton/witx-bindgen that referenced this issue Oct 4, 2022
This commit removes all support for the `resource` and `Handle` types
from the AST of `wit-parser` and all related support in all code
generators. The motivation for this commit is that `wit-bindgen` is on
the cusp of actually being able to work with components: producing a
component from guest output and consuming components in host generators.
More detail about this is in bytecodealliance#314. With components as an intermediate
format, however, there is no way to encode resources since they are not
part of the component model proposal yet.

All is not lost for handles and resources, though. The official design
for handles and resources is being worked on upstream in the component
model repository itself at this time and once added all of this support
will be re-added to `wit-bindgen`. In the meantime though I personally
think that the best way forward is to remove the interim support for a
few reasons:

* Primarily it unblocks progress at this time towards fully integrating
  components and the `wit-bindgen` generators. The requirement to run
  existing tests that use handles would mean that no host generator
  could actually switch to components and/or modes for today's
  core-wasm-lookalike would need to be preserved.

* Otherwise though the semantics of the current handles are basically
  invented out of thin air by myself and were never really formally
  specified, debated, or designed deliberately. I grafted `witx`-style
  handles into `wit-component` and added features as necessary over
  time, but it seems highly unlikely that the handles designed as part
  of the component model will be the ones that `wit-bindgen` currently
  supports. This inevitably means that a new system would need new code
  anyway and would likely result in removal regardless.

As usual git always has the history of handles and this all may come
back in one shape or another if only slightly tweaked. I'm confident in
our history spelunking abilities, though, so I don't feel that keeping
support in the repository is necessary for this purpose.
alexcrichton added a commit to alexcrichton/witx-bindgen that referenced this issue Oct 5, 2022
alexcrichton added a commit that referenced this issue Oct 5, 2022
* Remove support for handles and resources

* Remove resources from the demo

* Fix rebase conflict
alexcrichton added a commit that referenced this issue Oct 5, 2022
* Remove support for handles and resources

* Remove resources from the demo

* smuggle wit information in custom sections

* move transcoder to the crate, and make it available in the cli

* gen-guest-rust can emit custom component-type section

* custom section takes a pub static, not a const

* ComponentEncoder needs to own its Interfaces

so that I can use the Interfaces decoded from the module's custom
section

* make ComponentEncoder always transcode component-type info from custom sections

* flavorful tests: types and functions are actually the same namespace

they're not in wit, but that's a bug we need to fix, because they are in
component types

* test-helpers: build rust guests with wasm32-unknown-unknown and assert they encode as components

except for "invalid" and "handles" which are not gonna work

* refactor

* gen-guest-rust: now infallible

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
alexcrichton added a commit to alexcrichton/witx-bindgen that referenced this issue Oct 13, 2022
This commit is the implementation of bytecodealliance#314 for the JS host generator.
The commit here covers basically everything in that issue for JS except
it generates a slightly different structure of the JS output. Otherwise
the main highlights are:

* The JS host generator no longer implements the core `Generator` trait
  since it doesn't really fit in the component-as-input world. For now I
  created an `InterfaceGenerator` trait since the functionality needed
  is similar, just not precisely the same. I expect that this will get
  iterated on over time as worlds and other host generators take shape.

* The `wasmtime-environ` crate from Wasmtime itself, typically a "private"
  dependency of Wasmtime, is used to parse the input component and
  generate an `instantiate` function in JS. Wasmtime does all the heavy
  lifting of creating a linear list of initializers for the component
  and this empowers the generator to simply generate JS that is the list
  of initializers.

* The `wit-component` crate is used to "extract" the `Interface`
  descriptions from the input component. This is used to generate type
  information for TypeScript as well as types for lifting/lowering.
  Internally a correlation is done from a lowering/lifting to an
  `Interface` function to connect the dots when instantiating a component.
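
The "linear list of initializers" idea from the second bullet can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `InstantiateModule` and `ExtractMemory` classes, the `emit_instantiate` function, and the `compileCore` callback in the emitted JS are all hypothetical and are not the types or output of `wasmtime-environ` or the real generator.

```python
import json
from dataclasses import dataclass, field


@dataclass
class InstantiateModule:
    """Hypothetical initializer: instantiate core module `module_index`."""
    module_index: int
    imports: dict = field(default_factory=dict)  # import name -> JS expression


@dataclass
class ExtractMemory:
    """Hypothetical initializer: pull a memory export out of an instance."""
    instance_index: int
    export_name: str


def emit_instantiate(initializers):
    """Turn the initializer list into the source of a JS `instantiate` function."""
    lines = ["export async function instantiate(compileCore, imports) {"]
    instance_count = 0
    memory_count = 0
    for init in initializers:
        if isinstance(init, InstantiateModule):
            args = ", ".join(
                f"{json.dumps(name)}: {expr}" for name, expr in init.imports.items()
            )
            lines.append(
                f"  const instance{instance_count} = await WebAssembly.instantiate("
                f"await compileCore({init.module_index}), {{ {args} }});"
            )
            instance_count += 1
        elif isinstance(init, ExtractMemory):
            lines.append(
                f"  const memory{memory_count} = "
                f"instance{init.instance_index}.exports[{json.dumps(init.export_name)}];"
            )
            memory_count += 1
        else:
            raise TypeError(f"unknown initializer: {init!r}")
    if instance_count == 0:
        raise ValueError("expected at least one module instantiation")
    # Assumption for this sketch: the last instance provides the exports.
    lines.append(f"  return instance{instance_count - 1}.exports;")
    lines.append("}")
    return "\n".join(lines)


js = emit_instantiate([
    InstantiateModule(0, {"env": "imports.env"}),
    ExtractMemory(0, "memory"),
    InstantiateModule(1, {"shim": "instance0.exports"}),
])
print(js)
```

Because the initializers arrive already ordered, the generator in this sketch needs no dependency analysis of its own; it emits one JS statement per initializer in sequence.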

Lots of pieces needed updating here such as the runtime tests, the
`wit-bindgen` CLI tool, and the demo web page. These are all updated for
the new structure of the JS host generator.

Overall this surprisingly went much more smoothly than I expected. With
`wit-bindgen` being able to extract `Interface` representations from a
component most of the prior JS host code generation was able to be
reused. Along the way I also felt that the addition of "worlds" would
have relatively obvious insertion points and would be relatively easily
handled in all the places.

The demo and runtime tests are proof enough to me at least that this all
works internally, and this feels like a solid foundation to iterate from
with the addition of worlds and continued support for JS hosts.
@alexcrichton
Member Author

At this point this issue is nearing completion. #355 has implemented this change for the Wasmtime host generator and #373 is the implementation for JS. The only remaining piece is the wasmtime-py host generator, which should be pretty straightforward since it can simply copy what JS did.

Overall I'm personally feeling quite good about these changes. Everything seems to fit well together and this all feels like a solid technical foundation to continue building on. In particular, I feel the addition of worlds to *.wit will fit naturally within the new structure of all the generators and throughout wit-component as well.

alexcrichton added a commit to alexcrichton/witx-bindgen that referenced this issue Oct 13, 2022
alexcrichton added a commit to alexcrichton/witx-bindgen that referenced this issue Oct 13, 2022
alexcrichton added a commit that referenced this issue Oct 13, 2022
* js: Take a component as input, not interfaces

* Rebase and fix tests

* Pull in latest `wasmtime` where `main` now works
* Add a `testwasi` implementation for JS and use it in all tests
* Add a dummy `printf` to a C test to ensure it imports `testwasi` like
  the other languages.
* Add a `fd_fdstat_get` stub to make C happy
* Update `fd_write` to only work for fd 1

* Review comments
@alexcrichton
Member Author

This is now finished with the update to the wasmtime-py generator, so I'm going to close this.
