-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fully support multiple returns in Wasmtime #2806
Fully support multiple returns in Wasmtime #2806
Conversation
23a918d
to
5eff77c
Compare
For quite some time now Wasmtime has "supported" multiple return values, but only in the mose bare bones ways. Up until recently you couldn't get a typed version of functions with multiple return values, and never have you been able to use `Func::wrap` with functions that return multiple values. Even recently where `Func::typed` can call functions that return multiple values it uses a double-indirection by calling a trampoline which calls the real function. The underlying reason for this lack of support is that cranelift's ABI for returning multiple values is not possible to write in Rust. For example if a wasm function returns two `i32` values there is no Rust (or C!) function you can write to correspond to that. This commit, however fixes that. This commit adds two new ABIs to Cranelift: `WasmtimeSystemV` and `WasmtimeFastcall`. The intention is that these Wasmtime-specific ABIs match their corresponding ABI (e.g. `SystemV` or `WindowsFastcall`) for everything *except* how multiple values are returned. For multiple return values we simply define our own version of the ABI which Wasmtime implements, which is that for N return values the first is returned as if the function only returned that and the latter N-1 return values are returned via an out-ptr that's the last parameter to the function. These custom ABIs provides the ability for Wasmtime to bind these in Rust meaning that `Func::wrap` can now wrap functions that return multiple values and `Func::typed` no longer uses trampolines when calling functions that return multiple values. Although there's lots of internal changes there's no actual changes in the API surface area of Wasmtime, just a few more impls of more public traits which means that more types are supported in more places! Another change made with this PR is a consolidation of how the ABI of each function in a wasm module is selected. The native `SystemV` ABI, for example, is more efficient at returning multiple values than the wasmtime version of the ABI (since more things are in more registers). To continue to take advantage of this Wasmtime will now classify some functions in a wasm module with the "fast" ABI. Only functions that are not reachable externally from the module are classified with the fast ABI (e.g. those not exported, used in tables, or used with `ref.func`). This should enable purely internal functions of modules to have a faster calling convention than those which might be exposed to Wasmtime itself. Closes bytecodealliance#1178
5eff77c
to
81e96c0
Compare
If the wasm C ABI adopts multiple return values as a way to return small-ish structs, and if a hypothetical future Rust supports this when returning Ah, and as a minor bikeshed, rather than "WasmtimeSystemV" and "WasmtimeFastcall", what would you think of a more explicit name, like "SingleRetSystemV" and "SingleRetFastcall" or so, to help keep it clear what its purpose is? |
Can you clarify what you mean about the wasm C ABI? The current proposed wasm-c-api For the naming bits, I would personally think that we should probably keep "wasmtime" in the name to give ourselves the liberty to easily change it in the future. For example baldrdash has a whole bunch of custom conventions and such throughout cranelift, and I could imagine us wanting to add something like that for wasmtime in the future to optimize various bits and pieces here and there. |
Ah, I meant the toolchain C ABI used by clang, where we've talked about adding multi-value for small structs once it's sufficiently supported in engines. If WasmtimeSystemV etc. are meant to adapt to other Wasmtime needs in the future, should we document that they aren't ABI-stable, similar to Fast and Cold? |
Ah I see! I think, yes, we're able to optimize this in the future without any user-visible changes. We don't expose anything about the ABI details from the And sure, I can document that these are unstable ABIs. |
Subscribe to Label Actioncc @peterhuene
This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "cranelift:wasm", "lightbeam", "wasmtime:api"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
This was overwriting upper bits when 32-bit registers were being stored into return values, so fix the code inline to do a sized store instead of one-size-fits-all store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
I spent the most time looking over the Cranelift ABI changes; I didn't look as closely at the trampoline/function-registry/translation bits, but I can do a deeper review of those parts too if you'd like.
Once thing that would be useful to test more thoroughly, I think, is the robustness of calls between Wasmtime-ABI and Fast-ABI functions within a Wasm module. So far we have mostly exercised same-ABI-to-same-ABI calls; there are some interesting corner cases in the code to handle calling a function of a different ABI (mostly due to differing clobber sets actually, but in general it is a different case).
So given that, perhaps we could add some filetests with the two interesting combos (wasmtime -> fast, fast -> wasmtime) and varying numbers of args and rets, and just assert a golden compilation result? We could also add some run tests that pass multiple args/rets, with different alignment requirements, etc.
Ok I think I got everything working now! Since it was a very recent change, @cfallin can you confirm you saw 95f86be? That was necessary to fix some segfaults that were cropping up on windows since the 4-byte return values were getting stored as 8-byte results. The comment there makes me suspicious though... I made a stab at getting this implemented in the old backend but I ended up giving up. The old backend implements multi-value returns very differently than the new backend. The old backend spills everything to the stack if it doesn't all fit in registers, which isn't the behavior that we want here (one in registers everything else on the stack). I couldn't figure out how to update the old backend to do this but I figured it wasn't really that necessary. I just removed some trait impls for the old backend in the I'll tag @peterhuene for the review of the wasmtime bits, thanks for looking at the Cranelift bits @cfallin! |
That was indeed a suspicious comment. I am not sure exactly what the original intent was but I double-checked just now that this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So these changes look great and I really like how TypedFunc
works now.
Initially I had some concerns over emitting Windows unwind information for CallConv::Fast
functions, but I can't think of an incompatibility that might ensue in the current form.
Windows unwind information should be able to represent the systemv prologues that CallConv::Fast
uses and the epilogues are the same as Windows' calling convention (that's actually more important because epilogues on Windows, at least for x64, must be of a particular form).
I guess my remaining concern over the mixing of calling conventions is if CallConv::Fast
changes in a subtle way that creates invalid Windows unwind information, thereby making it such that we can't unwind through that frame properly.
Perhaps we should add a Windows test where we trap with several "internal" CallConv::Fast
frames on the trace and we can properly get the trap's backtrace?
I was a little worried about the unwind info myself (and kept switching it back and forth from @cfallin also were you thinking of more than the tests I added in 5962d4c? I tried to break things in various ways but I think it all seems to be working pretty well so far at least. |
@alexcrichton That's very thorough, thanks! I may be missing something but I don't see any |
Ok I've managed to get all tests green and I think I've got all the various tests here in place. It's of course always easy though to use the same abi everywhere if any issues arise, but I'm gonna go ahead and merge this. If there are more tests desired though I'm happy to add some! |
Just a note for future reference that this commit has also caused another regression on our side, causing stack overflows on Windows in async tests. I unfortunately don't have more to share, because debugging doesn't work reliably: the called wasm function panics (which is the subject of the test, so it's supposed to do so). When running in the VSCode debugger I hit SIGILL and then the program aborts, while running outside of a debugger there's a stack overflow somewhere. I haven't managed to make an isolated test case either, since I don't have any clues about what's causing the failure. |
For quite some time now Wasmtime has "supported" multiple return values,
but only in the mose bare bones ways. Up until recently you couldn't get
a typed version of functions with multiple return values, and never have
you been able to use
Func::wrap
with functions that return multiplevalues. Even recently where
Func::typed
can call functions that returnmultiple values it uses a double-indirection by calling a trampoline
which calls the real function.
The underlying reason for this lack of support is that cranelift's ABI
for returning multiple values is not possible to write in Rust. For
example if a wasm function returns two
i32
values there is no Rust (orC!) function you can write to correspond to that. This commit, however
fixes that.
This commit adds two new ABIs to Cranelift:
WasmtimeSystemV
andWasmtimeFastcall
. The intention is that these Wasmtime-specific ABIsmatch their corresponding ABI (e.g.
SystemV
orWindowsFastcall
) foreverything except how multiple values are returned. For multiple
return values we simply define our own version of the ABI which Wasmtime
implements, which is that for N return values the first is returned as
if the function only returned that and the latter N-1 return values are
returned via an out-ptr that's the last parameter to the function.
These custom ABIs provides the ability for Wasmtime to bind these in
Rust meaning that
Func::wrap
can now wrap functions that returnmultiple values and
Func::typed
no longer uses trampolines whencalling functions that return multiple values. Although there's lots of
internal changes there's no actual changes in the API surface area of
Wasmtime, just a few more impls of more public traits which means that
more types are supported in more places!
Another change made with this PR is a consolidation of how the ABI of
each function in a wasm module is selected. The native
SystemV
ABI,for example, is more efficient at returning multiple values than the
wasmtime version of the ABI (since more things are in more registers).
To continue to take advantage of this Wasmtime will now classify some
functions in a wasm module with the "fast" ABI. Only functions that are
not reachable externally from the module are classified with the fast
ABI (e.g. those not exported, used in tables, or used with
ref.func
).This should enable purely internal functions of modules to have a faster
calling convention than those which might be exposed to Wasmtime itself.
Closes #1178