Overhaul how type information gets to the CLI #124

alexcrichton · 2018-04-13T16:06:57Z

This commit is a complete overhaul of how the #[wasm_bindgen] macro
communicates type information to the CLI tool, and it's done in a somewhat...
unconventional fashion.

Today we've got a problem where the generated JS needs to understand the types
of each function exported or imported. This understanding is what enables it to
generate the appropriate JS wrappers and such. We want to, however, be quite
flexible and extensible in types that are supported across the boundary, which
means that internally we rely on the trait system to resolve what's what.

Communicating the type information historically was done by creating a four byte
"descriptor" and using associated type projections to communicate that to the
CLI tool. Unfortunately four bytes isn't a lot of space to cram information like
arguments to a generic function, tuple types, etc. In general this just wasn't
flexible enough and the way custom references were treated was also already a
bit of a hack.

This commit takes a radical step of creating a descriptor function for each
function imported/exported. The really crazy part is that the wasm-bindgen CLI
tool now embeds a wasm interpreter and executes these functions when the CLI
tool is invoked. By allowing arbitrary functions to get executed it's now much
easier to inform wasm-bindgen about complicated structures of types. Rest
assured though that all these descriptor functions are automatically unexported
and gc'd away, so this should not have any impact on binary sizes.

A new internal trait, WasmDescribe, is added to represent a description of all
types, sort of like a serialization of the structure of a type that
wasm-bindgen can understand. This works by calling a special exported function
with a u32 value a bunch of times. This means that when we run a descriptor we
effectively get a Vec<u32> in the wasm-bindgen CLI tool. This list of
integers can then be parsed into a rich enum for the JS generation to work
with.
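
As a rough illustration of the paragraph above, here is a minimal, self-contained sketch of the idea. It is not the code from this PR: the tag constants, the `inform` function, the `run_descriptor` helper, the `Descriptor` enum, and the thread-local sink are all hypothetical stand-ins, and the real encoding and the real interpreter-based execution may look quite different. Only the trait name `WasmDescribe` and the "stream of `u32` values parsed into an enum" shape come from the description.

```rust
use std::cell::RefCell;

// Hypothetical tag values; the encoding actually used by wasm-bindgen may differ.
const TAG_U32: u32 = 1;
const TAG_F64: u32 = 2;
const TAG_REF: u32 = 3;
const TAG_SLICE: u32 = 4;

thread_local! {
    // Stand-in for the CLI side: collects every u32 a descriptor reports.
    static SINK: RefCell<Vec<u32>> = RefCell::new(Vec::new());
}

// Stand-in for the special function that a descriptor calls repeatedly.
fn inform(value: u32) {
    SINK.with(|sink| sink.borrow_mut().push(value));
}

// Each type describes its own structure by emitting a sequence of u32 values.
trait WasmDescribe {
    fn describe();
}

impl WasmDescribe for u32 {
    fn describe() {
        inform(TAG_U32);
    }
}

impl WasmDescribe for f64 {
    fn describe() {
        inform(TAG_F64);
    }
}

// Compound types emit their own tag, then defer to the inner type.
impl<'a, T: WasmDescribe + ?Sized> WasmDescribe for &'a T {
    fn describe() {
        inform(TAG_REF);
        T::describe();
    }
}

impl<T: WasmDescribe> WasmDescribe for [T] {
    fn describe() {
        inform(TAG_SLICE);
        T::describe();
    }
}

// The "rich enum" that the flat list of integers is parsed back into.
#[derive(Debug, PartialEq)]
enum Descriptor {
    U32,
    F64,
    Ref(Box<Descriptor>),
    Slice(Box<Descriptor>),
}

// Runs a descriptor and returns everything it reported (here, natively;
// in the PR this step happens inside an embedded wasm interpreter).
fn run_descriptor<T: WasmDescribe + ?Sized>() -> Vec<u32> {
    SINK.with(|sink| sink.borrow_mut().clear());
    T::describe();
    SINK.with(|sink| sink.borrow().clone())
}

// Recursive-descent parse of the integer list; advances `words` as it goes.
fn parse(words: &mut &[u32]) -> Option<Descriptor> {
    let current: &[u32] = *words;
    let (&tag, rest) = current.split_first()?;
    *words = rest;
    Some(match tag {
        TAG_U32 => Descriptor::U32,
        TAG_F64 => Descriptor::F64,
        TAG_REF => Descriptor::Ref(Box::new(parse(words)?)),
        TAG_SLICE => Descriptor::Slice(Box::new(parse(words)?)),
        _ => return None,
    })
}
```

In this sketch, `run_descriptor::<&[f64]>()` yields the three tags for reference, slice, and `f64` in that order, and `parse` turns them back into `Descriptor::Ref(Slice(F64))`. Because the descriptor is ordinary code driven by trait resolution, nesting depth is not limited by a fixed-size encoding.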

This commit currently only retains feature parity with the previous
implementation. I hope to soon solve issues like #123, #104, and #111 with this
support.

fitzgen · 2018-04-13T20:57:12Z

crates/backend/src/ast.rs

+}
+
+impl Enum {
+    fn shared(&self) -> shared::Enum {


trait IntoSharedIr {
    type Shared;
    fn shared(&self) -> Self::Shared;
}

// ...

impl IntoSharedIr for Enum {
    type Shared = shared::Enum;
    fn shared(&self) -> shared::Enum {
        // ...
    }
}

// ...

?

fitzgen · 2018-04-13T21:17:53Z

crates/backend/src/codegen.rs

+            (description.len() >> 0) as u8,
+            (description.len() >> 8) as u8,
+            (description.len() >> 16) as u8,
+            (description.len() >> 24) as u8,


I know this is just code motion, but why is this manual byte ordering here? Feels like something that deserves a comment.
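
For readers following along, the four shifted lines write the length as four bytes in little-endian order (least significant byte first). The sketch below only illustrates that pattern; the function names are hypothetical, and it does not claim to explain why the PR's author chose this layout. One plausible reading is that a fixed byte order keeps the encoded length independent of the host's endianness.

```rust
/// Encodes a length as four little-endian bytes using explicit shifts,
/// mirroring the pattern in the diff above. Lengths above u32::MAX are
/// truncated, because only the low 32 bits are written.
fn len_le_bytes_manual(len: usize) -> [u8; 4] {
    [
        (len >> 0) as u8,  // bits 0..8, least significant byte first
        (len >> 8) as u8,  // bits 8..16
        (len >> 16) as u8, // bits 16..24
        (len >> 24) as u8, // bits 24..32
    ]
}

/// The same encoding via the standard library helper available in newer
/// Rust versions.
fn len_le_bytes_std(len: usize) -> [u8; 4] {
    (len as u32).to_le_bytes()
}

/// Decodes the four bytes back into a length, as a reader of the encoded
/// data would need to do.
fn decode_len(bytes: [u8; 4]) -> u32 {
    u32::from_le_bytes(bytes)
}
```

Both encoders produce identical bytes for any length that fits in 32 bits, so a comment naming the byte order (or a call to `to_le_bytes` where the toolchain supports it) would make the intent explicit.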

fitzgen · 2018-04-13T22:17:15Z

This commit takes a radical step of creating a descriptor function for each
function imported/exported. The really crazy part is that the wasm-bindgen CLI
tool now embeds a wasm interpreter and executes these functions when the CLI
tool is invoked. By allowing arbitrary functions to get executed it's now much
easier to inform wasm-bindgen about complicated structures of types. Rest
assured though that all these descriptor functions are automatically unexported
and gc'd away, so this should not have any impact on binary sizes.

A new internal trait, WasmDescribe, is added to represent a description of all
types, sort of like a serialization of the structure of a type that
wasm-bindgen can understand. This works by calling a special exported function
with a u32 value a bunch of times. This means that when we run a descriptor we
effectively get a Vec<u32> in the wasm-bindgen CLI tool. This list of
integers can then be parsed into a rich enum for the JS generation to work
with.

This seems like a good comment to have at the top of some module -- maybe describe.rs?

alexcrichton · 2018-04-14T18:04:47Z

Thanks for taking a look @fitzgen! There are actually a lot more changes coming soon as well to fix the issues mentioned above, so I think I'm gonna defer rewriting and updating DESIGN.md, but I'll be sure to get around to that soon.

fitzgen reviewed Apr 13, 2018

konstin mentioned this pull request Apr 14, 2018

Support new Foo(...) to fix #115 #127

Merged

alexcrichton force-pushed the describe branch from 90cb201 to 3305621 Compare April 14, 2018 18:15

alexcrichton merged commit 3305621 into master Apr 14, 2018

alexcrichton deleted the describe branch April 14, 2018 18:20

alexcrichton mentioned this pull request Apr 16, 2018

Update DESIGN.md on recent closure/conversion changes #135

Merged
