Tracking issue for end-to-end wasi-threads support #10

Closed
11 of 15 tasks
abrown opened this issue Oct 25, 2022 · 17 comments
Comments

@abrown
Collaborator

abrown commented Oct 25, 2022

Several of us (@loganek, @sunfishcode, @haraldh) have been working towards implementing all of the pieces to demonstrate an end-to-end wasi-threads example. The current direction is to implement this in Wasmtime, though @loganek has also opened a PR to do so in WAMR (#1638). To that end, this issue is meant to track all of the various parts needed not only to show a proof-of-concept, but to upstream enough code for (fearless) users to try out this new functionality. I do not expect this to be a comprehensive plan, but only to implement what is necessary for the "stage 1" functionality described by @alexcrichton here (i.e., no component model integration).

We can split this into areas and I've made an effort to try to order the tasks within these.

Specification

  • spec: consider adding wasi_thread_exit for early return from a thread (#7); will require some implementation in wasi-libc and wasmtime
  • spec: fix CI issues (#26)

Toolchains

  • toolchain: allow importing and exporting the same memory in wasm-ld; done in D135898
  • toolchain: allow compiling programs with wasm32-wasi-threads (#326); in progress at #331 and #274
  • toolchain: teach the toolchains to import shared memories by default; it seems most natural to have both the parent and child threads import a memory versus exporting it 1; under discussion at #502

Libraries

  • wasi-libc: only run C constructors once (#339); ready to merge
  • wasi-libc: initial thread-local values must be memcpy-ed to the TLS area; this seems to be handled by some combination of #342 and #343
  • wasi-libc: resolve spinlock questions to merge pthread spinlock support (#324)
  • wasi-libc: create a way to run libc-test tests; in progress at #369
  • wasi-libc: audit for any missing pthreads APIs — the fundamentals are there but we need to determine what functions remain to be enabled
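The TLS item above (initial thread-local values must be copied into each thread's TLS area) can be illustrated with a minimal sketch. The `TlsTemplate` type and its fields are hypothetical names invented for illustration; this is not how wasi-libc or wasm-ld represent the TLS segment, only the general idea of stamping a per-thread block from a shared template.

```rust
/// Hypothetical description of a module's thread-local segment.
pub struct TlsTemplate<'a> {
    /// Bytes of the initialized thread-local data.
    pub init: &'a [u8],
    /// Total size of the per-thread block, including a zero-filled tail.
    pub size: usize,
}

impl TlsTemplate<'_> {
    /// Build a fresh TLS block for one thread: zero everything, then copy
    /// the initial values over the front of the block (the "memcpy" step).
    pub fn instantiate(&self) -> Vec<u8> {
        assert!(self.init.len() <= self.size);
        let mut block = vec![0u8; self.size];
        block[..self.init.len()].copy_from_slice(self.init);
        block
    }
}

/// Two threads get independent blocks that start from the same values.
pub fn demo() -> (Vec<u8>, Vec<u8>) {
    let template = TlsTemplate { init: &[1, 2, 3], size: 5 };
    let mut first = template.instantiate();
    let second = template.instantiate();
    first[0] = 42; // a write in one thread's block is not seen by the other
    (first, second)
}
```

If the copy is skipped, a new thread would see zeros (or stale data) instead of the initial values the program declared for its thread-local variables.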

Engines

I'm completely open to adding/removing/editing the items above, as well as moving this issue somewhere else, but I felt it would be helpful to keep track of the state of things.

Footnotes

  1. this also likely involves figuring out a better place to create the shared memory initially in Wasmtime (instead of here)

@haraldh

haraldh commented Oct 26, 2022

  • wasmtime: implement wait and notify; on Linux, this could use futex as done here by @haraldh but we need a solution for all OSes (also, is the 32-bit limitation a problem?)

So, there is the atomic-wait crate written by Mara Bos. Here is what she wrote about the other OSes:

Cross platform atomic wait and wake (aka futex) functionality.

This crate only supports functionality that's available on all of Linux, Windows, and macOS. That is:

  • Only AtomicU32 is supported. (Linux currently only supports 32-bit futexes.)
  • Only the "wait", "wake one", and "wake all" operations are supported. (Linux supports more operations, but Windows and macOS don't.)
  • No timeouts. (macOS doesn't have a stable/public API for timeouts.)
  • The wake operations don't return the number of threads woken up. (Only Linux supports this.)

So I don't know which OS was assumed by whoever specified memory.atomic.notify, memory.atomic.wait32, and memory.atomic.wait64.
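One portable way around these OS gaps is to emulate wait and notify in user space instead of calling a native futex. The following is a minimal sketch using only Rust's standard `Mutex` and `Condvar`; `FutexTable`, `Queue`, and `WaitResult` are hypothetical names, not Wasmtime's or the atomic-wait crate's API. It supports the pieces the list above says are not portable: timeouts and a count of woken waiters.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Mirrors the wait32 result codes: 0 = woken, 1 = not-equal, 2 = timed-out.
#[derive(Debug, PartialEq)]
pub enum WaitResult { Woken, NotEqual, TimedOut }

/// Per-address bookkeeping: registered waiters and wakeups not yet consumed.
#[derive(Default)]
struct Queue { waiters: u32, wakeups: u32 }

/// Hypothetical emulation table keyed by the address of the waited-on cell.
/// Entries are never removed, which a real implementation would want to fix.
#[derive(Default)]
pub struct FutexTable { queues: Mutex<HashMap<usize, Queue>>, cond: Condvar }

fn key_of(cell: &AtomicU32) -> usize { cell as *const AtomicU32 as usize }

impl FutexTable {
    pub fn wait(&self, cell: &AtomicU32, expected: u32, timeout: Option<Duration>) -> WaitResult {
        let key = key_of(cell);
        let deadline = timeout.map(|t| Instant::now() + t);
        let mut queues = self.queues.lock().unwrap();
        // Compare under the lock so a notify cannot slip in between the
        // check and the registration below.
        if cell.load(Ordering::SeqCst) != expected {
            return WaitResult::NotEqual;
        }
        queues.entry(key).or_default().waiters += 1;
        loop {
            let q = queues.get_mut(&key).unwrap();
            if q.wakeups > 0 {
                q.wakeups -= 1;
                q.waiters -= 1;
                return WaitResult::Woken;
            }
            match deadline {
                None => queues = self.cond.wait(queues).unwrap(),
                Some(d) => {
                    let now = Instant::now();
                    if now >= d {
                        queues.get_mut(&key).unwrap().waiters -= 1;
                        return WaitResult::TimedOut;
                    }
                    queues = self.cond.wait_timeout(queues, d - now).unwrap().0;
                }
            }
        }
    }

    /// Grants up to `count` wakeups and returns how many waiters will wake.
    pub fn notify(&self, cell: &AtomicU32, count: u32) -> u32 {
        let mut queues = self.queues.lock().unwrap();
        let woken = match queues.get_mut(&key_of(cell)) {
            Some(q) => {
                let n = count.min(q.waiters - q.wakeups);
                q.wakeups += n;
                n
            }
            None => 0,
        };
        drop(queues);
        if woken > 0 {
            self.cond.notify_all();
        }
        woken
    }
}

pub fn demo() -> (WaitResult, WaitResult, u32) {
    let table = Arc::new(FutexTable::default());
    let cell = Arc::new(AtomicU32::new(0));
    let not_equal = table.wait(&cell, 1, None); // value is 0, not 1
    let timed_out = table.wait(&cell, 0, Some(Duration::from_millis(10)));
    let (t, c) = (table.clone(), cell.clone());
    let waiter = thread::spawn(move || t.wait(&c, 0, None));
    let mut woken = 0;
    while woken == 0 {
        // Retry until the waiter thread has registered itself.
        woken = table.notify(&cell, 1);
        thread::yield_now();
    }
    assert_eq!(waiter.join().unwrap(), WaitResult::Woken);
    (not_equal, timed_out, woken)
}
```

A single shared condition variable keeps the sketch short at the cost of waking every sleeper on each notify. Because the table only stores an address as a key, the same structure would work for a 64-bit cell, which is one way to sidestep the 32-bit futex limitation.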

@haraldh

haraldh commented Oct 26, 2022

wasi_thread_kill would also be a good extension

@sbc100
Member

sbc100 commented Oct 26, 2022

wasi_thread_kill would also be a good extension

Do you mean pthread_exit, or something to implement pthread_cancel, or something to implement pthread_kill (which is oddly named, since it delivers signals and doesn't control the lifetime of a thread)?

@haraldh

haraldh commented Oct 27, 2022

wasi_thread_kill would also be a good extension

Do you mean pthread_exit, or something to implement pthread_cancel, or something to implement pthread_kill (which is oddly named, since it delivers signals and doesn't control the lifetime of a thread)?

Yeah, something to implement pthread_cancel I had in mind.

@sbc100
Member

sbc100 commented Oct 27, 2022

So there are two types of pthread cancellation, DEFERRED and ASYNCHRONOUS: https://man7.org/linux/man-pages/man3/pthread_setcanceltype.3.html.

IIUC, deferred cancellation is much simpler and doesn't need any OS support; it's all implementable in user space. Async cancellation is much harder and normally relies on some kind of async signal primitive under the hood. I have previously proposed that we simply ignore asynchronous cancellation, since async signals are very different from anything that exists in wasi/wasm today. I would also argue that very few real-world programs depend on it, but my information here is obviously not complete.
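To illustrate why deferred cancellation needs no OS support, here is a minimal user-space sketch: the canceller sets a flag and the target thread polls it at well-defined cancellation points. `CancelToken`, `Cancelled`, and `worker` are hypothetical names for illustration; this is not the wasi-libc implementation of pthread_cancel.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical cancellation flag shared between canceller and target.
#[derive(Clone, Default)]
pub struct CancelToken(Arc<AtomicBool>);

/// Returned from a cancellation point once cancellation was requested.
#[derive(Debug, PartialEq)]
pub struct Cancelled;

impl CancelToken {
    /// Request cancellation; the target only notices at its next check.
    pub fn cancel(&self) {
        self.0.store(true, Ordering::Release);
    }

    /// A cancellation point, loosely analogous to pthread_testcancel.
    pub fn check(&self) -> Result<(), Cancelled> {
        if self.0.load(Ordering::Acquire) { Err(Cancelled) } else { Ok(()) }
    }
}

/// Performs `units` steps of work, checking for cancellation before each.
pub fn worker(token: &CancelToken, units: u64) -> Result<u64, Cancelled> {
    for _ in 0..units {
        token.check()?;
        thread::yield_now(); // stand-in for one step of real work
    }
    Ok(units)
}

/// Starts a worker that would effectively never finish, then cancels it.
pub fn demo() -> Result<u64, Cancelled> {
    let token = CancelToken::default();
    let for_worker = token.clone();
    let handle = thread::spawn(move || worker(&for_worker, u64::MAX));
    token.cancel();
    handle.join().unwrap()
}
```

The thread unwinds through ordinary control flow, so no signal delivery is involved. Asynchronous cancellation has no such polling point, which is why it would need a primitive that wasm does not currently offer.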

abrown changed the title from "Tracking issue for end-to-end support" to "Tracking issue for end-to-end wasi-threads support" on Nov 9, 2022
@abrown
Collaborator Author

abrown commented Nov 10, 2022

wasmtime: implement wait and notify; on Linux, this could use futex as done here by @haraldh but we need a solution for all OSes (also, is the 32-bit limitation a problem?)

@haraldh, it looks like V8 bottoms out in their own FutexEmulation implementation, which is used for all platforms, even Linux.

@abrown
Collaborator Author

abrown commented Nov 15, 2022

I've created bytecodealliance/wasmtime#5274 to make a way to measure how much (or little) locking will affect WASI performance.

@turbolent

Once a spec or some form of docs/description and some example/test binaries are available, I'd love to implement support in https://github.com/turbolent/w2c2.

@loganek
Collaborator

loganek commented Dec 6, 2022

Once a spec or some form of docs/description and some example/test binaries are available, I'd love to implement support in https://github.com/turbolent/w2c2.

@turbolent This might be helpful: #11

@yamt
Contributor

yamt commented Dec 11, 2022

This weekend, I implemented wasi-threads for toywasm.
While it still has rough edges, it can run toywasm itself built for wasi, which uses pthreads with the latest wasi-sdk.

@turbolent

@loganek Thank you, I'll give that a try 👍
@yamt Congrats, great work!

@loganek
Collaborator

loganek commented Dec 13, 2022

FYI WAMR implementation is being tracked here: bytecodealliance/wasm-micro-runtime#1790

@abrown
Collaborator Author

abrown commented Jan 19, 2023

For those interested, I updated the description for this issue with the state of the latest PRs. With a highly-customized environment (e.g., bytecodealliance/wasmtime#5484), I have a benchmark that runs parallel compression using wasi-threads in Wasmtime. I think @yamt has also shown how to run ffmpeg in parallel somewhere (with Wasmtime or his toy engine, I'm not sure).

In my view, the current state of this issue is that the main check boxes above are all covered by applicable PRs. We simply need to clean up and merge those PRs for wasi-threads to be generally available for testing. It might take a while for things to start showing up in various releases and I would expect some bugs to be found in various projects, but my impression is that this effort is almost complete.

@loganek
Collaborator

loganek commented Jan 20, 2023

Should we target including the API in preview2? I don't mean to rush, but it feels like we are already quite confident about the API, have a few work-in-progress/completed implementations, and most of the important ambiguities have been clarified.

@abrown
Collaborator Author

abrown commented Jan 20, 2023

@loganek, that's a good idea. @sunfishcode, any thoughts on that?

@sunfishcode
Member

Because of wasi-threads' decision to go forward with instance-per-thread, I myself don't know how this could be possible within any predictable timeframe.

@abrown
Collaborator Author

abrown commented Feb 23, 2023

I am going to close this issue since the work it tracks, the end-to-end implementation of wasi-threads, is now upstreamed in both Wasmtime and WAMR, and all of the toolchain work to get here is contained in the wasi-sdk-20+threads pre-release of wasi-sdk. As you may notice, not all of the checkboxes in the original issue are checked, but this is fine: over the last few months things have progressed in ways that make some of the items unnecessary, and the necessary ones will continue on as separate issues or PRs. I tried to describe all of the work done here (by me and others!) in this blog post: https://bytecodealliance.org/articles/wasi-threads. Thanks to everyone who contributed!

abrown closed this as completed on Feb 23, 2023

7 participants