Rebase the Phase Process description on the CG's current process #549
The [CG Phase Process document] has recently split out the entry requirements for each stage from the activities that happen within each stage, fixing an ambiguity about what happens before a stage and what happens within a stage. It also contains a number of generally useful updates.

This PR updates the WASI Phase Process using wording derived from the CG Phase Process, adapting it to meet WASI's needs. The resulting process is roughly the same as the existing process; however, I've made it more specific in a few areas:

- Phase 2 requires a wit description.
- Phase 3 requires there be a plan for how the phase 4 acceptance criteria will be met.

[CG Phase Process document]: https://github.com/WebAssembly/meetings/blob/main/process/phases.md
From the PR, for phase 2:
Several recent conversations have made me pause and consider what this change might do to the ecosystem. Not all of the implications of this change are clear to everyone and I think we should make them clear here to avoid discontent in the future. To my eyes, the WebAssembly ecosystem is already quite fractured (have you tried building a module that runs on a standalone engine AND the web?) and one of the bright spots was the agreement from several different corners of the web on the WASI standard. I've heard the concern that binding WASI to WIT — and implicitly to the component model — might cause fractures (e.g., other standards or non-standard APIs). Let me propose some questions so that someone can clarify where all this is headed. I'll do it from various perspectives:
I think I know the answer to some of these questions and some of them have been discussed in the past (cc: @sunfishcode, @pchickey) but I think making all of this explicit will be helpful. |
We wish the component model were not mandatory for WASI. The component model introduces complexity and additional resource requirements, and it is not always wanted. For example, footprint is a hard requirement for embedded and IoT usages, and we wish we could still use WASI for these domains.
my understanding is that having a WIT-defined interface doesn't imply to require component-model support. my impression is that WIT-defined interfaces are often less efficient if you compare them with an abi based on bare linear memory pointers though. |
I agree with @abrown here: answers to these questions would be helpful for many in the ecosystem. To @abrown's observation, the answers to these questions are often known inside the team working on the Component Model. It's just a change in perspective: approaching the Component Model from the outside in. I know many folks may be worried about stating known limitations. But I think it's an opportunity for community engagement. For instance, I'm aware that there is scope for performance improvements in how the marshalling between components works. For those enthusiastic about the component model, it points out an area where contributions and community focus may be welcome. On the general topic of performance and overhead: to @xwang98's point, we have similar concerns, but feel that getting some hard data on the performance and impact would be great. It would allow everyone to assess its suitability for their particular use cases. We'd love to see some performance metrics / data. But being realistic, I also know this isn't going to be possible until after preview 2 is released. Again, stating this as an area of contribution or focus following the release of Preview 2 would be great. Regarding the engine impact, would it be possible to get some engineering guidance from those that have implemented the component model in an engine already? I'm guessing this may be the Wasmtime team. This would help address the concerns of other runtimes and provide guidance on the suitability of the component model for various domains. Maybe a future blog post or interview? - just a thought.
Thank you for raising these questions this explicitly. I agree that it makes sense to explicitly work through them for other interested parties, so I'll walk through them in way more detail than you personally would need. (Note: this got very long, and I apologize. I'd highly recommend reading the first section, and then those Q&A entries you're interested in.) One thing I want to emphasize is that nothing has changed about any of this in a long time! The particular phase 2 requirement you mention ("A wit description of the API exists") has been in place since 2021, and has been explicitly and fairly prominently mentioned in the main README since early February 2022. The more fundamental approach of basing WASI on top of another standard to define the ABI is even older: the very first WASI overview document from when WASI was announced in April 2019 mentions WASI gaining support for "Host Bindings". Since then, the Host Bindings proposal merged with the Module Linking proposal into what's now the Component Model. That very first WASI overview also already includes the reason for moving towards defining WASI in terms of Host Bindings: the ABI approach used by WASI Preview 1 works very well for languages like C, C++, and Rust, which use linear memory. It doesn't work well at all for languages like Kotlin and Dart, which use Wasm GC. That's because it fundamentally assumes that there is a linear memory heap to read values from and write them into. With the Component Model's approach of defining the ABI in terms of canonical lowering and lifting operations for each data type, we can directly and efficiently support languages using GC by defining lifting and lowering operations that operate on GC objects.
Based on this, another thing I want to highlight is that not using an approach like the Component Model's WIT-based APIs means no real support for Kotlin, Dart, and other languages using Wasm GC. In summary, this PR clarifies some aspects of the process, but doesn't in any way represent a change in direction. @xwang98 the above means that the direction WASI is on hasn't changed, and that being based on what is now the Component Model has been part of the design since the very beginning. This approach has since been confirmed a number of times, and changing it now would not only mean discarding many person-years of work, but also require coming up with a different approach to at least some of the goals. E.g. I'm sure you agree that not supporting languages using Wasm GC isn't really an option. With that all out of the way, I'll try to answer @abrown's questions below, as well as some of the concerns @woodsmc raised.

Q&A
As @yamt says: no, that's not required to support content that'd otherwise use a WASI Preview1-style ABI using WITX. WebAssembly Components define a new binary format, but to support content that works in roughly the same way as Wasm core modules targeting Preview1, all that's needed is the ability to "unwrap" the core modules contained in these components, and then communicating with those via the canonical ABI. This ABI is roughly equivalent to the ABI witx defines and which is used in WASI Preview1. This approach is e.g. taken by the JCO toolchain to support running components in JS engines such as browsers or Node.js which (for now) lack native support. Another strong proof that this works is that we have multiple toolchains able to produce Components, despite none of them actually emitting the new binary format. Instead, they all emit core Wasm modules with the Component Model's ABI, which are then turned into Component binaries using external tooling.
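To make the idea of lowering and lifting through linear memory concrete, here is a deliberately simplified sketch. It is not the Canonical ABI itself: `LinearMemory`, `lower_string`, and `lift_string` are hypothetical names invented for illustration, and the bump allocator stands in for whatever allocator a real guest would export. The only point is that a string crosses the boundary as a (pointer, byte length) pair into a shared byte array.

```python
from typing import Tuple


class LinearMemory:
    """Toy linear memory with a bump allocator (hypothetical helper)."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.next_free = 0

    def alloc(self, length: int) -> int:
        # Hand out the next free region; a real guest allocator is more involved.
        ptr = self.next_free
        if ptr + length > len(self.data):
            raise MemoryError("toy linear memory exhausted")
        self.next_free += length
        return ptr


def lower_string(mem: LinearMemory, value: str) -> Tuple[int, int]:
    """Copy a string into linear memory; return (pointer, byte length)."""
    encoded = value.encode("utf-8")
    ptr = mem.alloc(len(encoded))
    mem.data[ptr:ptr + len(encoded)] = encoded
    return ptr, len(encoded)


def lift_string(mem: LinearMemory, ptr: int, length: int) -> str:
    """Read a string back out of linear memory from (pointer, byte length)."""
    return bytes(mem.data[ptr:ptr + length]).decode("utf-8")


mem = LinearMemory(64)
ptr, length = lower_string(mem, "héllo")
roundtrip = lift_string(mem, ptr, length)
```

A language using Wasm GC would need a different pair of lowering and lifting functions that operate on GC objects rather than on a byte array, which is the motivation given earlier in this comment.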
WASI has as part of its fundamental design goals extremely high security standards, the ability to treat all languages as first-class citizens—instead of just ones that behave like C—and the ability to make all APIs fully virtualizable. (I.e., to enable all APIs to be implemented in WebAssembly and with the same privileges of all other content.) Based on these goals, WASI (by virtue of being based on the Component Model) introduces two major constraints that can impose limitations on some API designs:
I don't think that any of this means that there is any kind of functionality that WIT fundamentally can't expose an API for. It's true however that some API designs won't work, and will need other approaches. I think the most important reason for that is the need for all APIs to be language-agnostic though. E.g. shared-everything multi-threading doesn't really mean the same thing for languages using linear memory and those using Wasm GC. Features of that nature are I think best handled as Component-internal, much like Wasm GC itself, exception handling, or stack switching.
Content will run in those runtimes that support the ABI and binary format it uses, and that implement the APIs it requires. There's absolutely nothing stopping runtimes from supporting all kinds of different ABIs and binary formats, so the answer to the first question is definitely "yes". Additionally, it's possible to fully support WASI Preview1-targeting content inside the component model. The Wasmtime project has been working on an adapter for just that, which should enable all Preview2-supporting runtimes to support Preview1 without maintaining multiple WASI implementations in parallel. Since WASI Preview2 introduces a whole host of additional functionality, such as the wasi-http API, it's not really possible to support Preview2 in runtimes that don't support these interfaces. But as mentioned above, runtimes can choose to only implement support for the canonical ABI and running single modules, instead of supporting linked Components as well.
Not in any way that's not there for WITX as well, and in fact inherent to WebAssembly in general. The fact that content runs inside a tight sandbox means that one can't just expose arbitrary regions of memory and operate on it without copying. The Component Model does however introduce a concept that significantly reduces this overhead: resources and resource handles. These can represent a large collection of values without having to copy them, and enables operating on them via associated functions/methods. @woodsmc, I hope I was able to address some of your concerns with the above, but I'm very happy to discuss things in more detail (though maybe not as part of a PR that's not actually about any of this 😉) |
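As a rough illustration of why handles can avoid copying, the sketch below keeps a large object on the host side and passes only a small integer across the boundary. `ResourceTable` and `buffer_len` are hypothetical names made up for this example; it is a sketch of the general handle-table idea under those assumptions, not of how any particular runtime implements Component Model resources.

```python
class ResourceTable:
    """Maps small integer handles to host-owned objects (hypothetical helper)."""

    def __init__(self) -> None:
        self._entries = {}
        self._next_handle = 1

    def insert(self, resource) -> int:
        # Store the object and return the integer that represents it.
        handle = self._next_handle
        self._next_handle += 1
        self._entries[handle] = resource
        return handle

    def get(self, handle: int):
        return self._entries[handle]

    def drop(self, handle: int) -> None:
        del self._entries[handle]


table = ResourceTable()
big_buffer = bytearray(1_000_000)   # stays on the host side
handle = table.insert(big_buffer)   # only this integer crosses the boundary


def buffer_len(h: int) -> int:
    """A 'method' on the resource: works on the object without copying it out."""
    return len(table.get(h))


size = buffer_len(handle)
same_object = table.get(handle) is big_buffer
table.drop(handle)
```

The design choice being illustrated is that operations are expressed as functions taking the handle, so the data itself never has to be serialized into the caller's memory.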
Thanks @tschneidereit .
Awesome; I know this is a perspective shift - rather than addressing our own internal community, we are answering the anticipated questions of others. I love this. Mainly because I get asked these types of questions from my organization frequently. Having them explicitly addressed lowers the barrier for entry and aids with technology adoption. It would be amazing to have these points addressed explicitly in some "published" form, an addition to this document, or another. So thanks again! RE: Performance / other issues - delighted to take it out of the PR. I'll reach out. Thank you. |
There is some inconsistency of where acceptance criteria is checked, maybe needs some clarification.
Contributing.md
Outdated
Note: While we mostly follow the [WebAssembly CG's Phase Process], the requirements around Web VM implementation, formal notation and the reference interpreter don't apply in the context of WASI.

Entry requirements:

* The phase 4 acceptance criteria are documented in the proposal.
Is it meant to be phase 3? There is a reference to the acceptance criteria there, while phase 4 seems to be "TBD"
Thanks for bringing this up.
The "phase 4 acceptance criteria" are defined here, and are intended to be a substitute for the "two Web VMs" requirement in the corresponding phase 4 in the CG process.
Phase 4 in the CG process means "the feature is done and handed off to the WG for proper standardization". For WASI, we aren't ready for anything analogous to that yet; we're still working on what we're calling "Preview 2".
So, this PR proposes defining "Preview 2" as requiring proposals be at only phase 3. But, we do want Preview 2 proposals to have something of the level of confidence that "Two Web VMs" have, so we added this language about the "phase 4 acceptance criteria" as a Preview 2 requirement.
Does that clarify it?
Maybe it's just me, but currently "phase 4 acceptance criteria" sounds like "acceptance criteria to a stage we haven't yet defined", also logically requiring phase 4 before phase 3 doesn't make very strong sense. Maybe have the criteria point to phase 3 officially or give them a phase-neutral name (how about "Preview 2 acceptance criteria", since that is the end goal)? Or define a phase 4?
If we renamed the "phase 4 acceptance criteria" to the "portability criteria", would that make it clear?
It should be more clear that way, I think. Oh wait, but this removes explicit 'multiple implementations' requirements from the old doc - that was hard to see, the diff seems to be mixing sections of the doc. Is it implied or not necessary?
I've now performed the rename, here and in #550.
Wrt your edit, I've also now re-introduced "two or more implementations" language. That was previously only an example, but it does seem worth having, even if we have to leave it up to proposals to define what kinds of implementations are needed.
Contributing.md
Outdated
Entry requirements:

* The phase 4 acceptance criteria must be either met or there must be a plan for how they're expected to be met.
Should this be phase 3? (this is inside 'phase 3' section of the doc)
It's written as intended; see my comment above.
actually, it's sometimes considerably more expensive as it involves malloc. |
if you moved the discussion to elsewhere, give me a pointer to the new place. thank you. |
@yamt The places where the ABI does a malloc fall into two categories: there are some malloc calls that we haven't optimized yet but will, and there are some malloc calls in areas that have no witx equivalent.
which category does eg. fd_read (besides malloc, it lacks of iov) fall into? |
when you say "haven't yet optimized yet but will", do you mean adapter functions? |
We haven't optimized it yet.
The malloc call can be optimized away by changing how the bindings are generated. iov functionality would require adding the feature to the canonical ABI spec, but it's doable. |
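The cost being discussed can be illustrated in general terms with a toy comparison between a read that allocates a fresh result buffer on every call and a read that fills a buffer supplied by the caller. This is only a sketch of the allocation-count difference: `Guest`, `read_callee_allocated`, and `read_into` are hypothetical names, and the thread does not state that the planned bindings optimization takes this exact form.

```python
class Guest:
    """Toy guest that counts allocator calls (hypothetical helper)."""

    def __init__(self) -> None:
        self.allocations = 0

    def alloc(self, n: int) -> bytearray:
        self.allocations += 1
        return bytearray(n)


SOURCE = b"hello world"  # stand-in for the data behind a file descriptor


def read_callee_allocated(guest: Guest, n: int) -> bytearray:
    """Each call allocates a new result buffer through the guest allocator."""
    buf = guest.alloc(min(n, len(SOURCE)))
    buf[:] = SOURCE[:len(buf)]
    return buf


def read_into(buf: bytearray) -> int:
    """The caller supplies the buffer; returns bytes written, no allocation."""
    n = min(len(buf), len(SOURCE))
    buf[:n] = SOURCE[:n]
    return n


guest = Guest()
for _ in range(3):
    read_callee_allocated(guest, 5)
allocs_callee = guest.allocations

reused = guest.alloc(5)  # allocated once, then reused across calls
for _ in range(3):
    read_into(reused)
allocs_caller = guest.allocations - allocs_callee
```

In this toy setup the first style allocates once per call while the second allocates once in total, which is the kind of difference that matters on constrained devices.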
ok. do you have any idea when/if such optimizations can be made? at core wasm level, how do such optimized versions look like? |
Recap of some of the discussions that I think would be useful to share:
|
In the interests of visibility I wanted to share and get some feedback on two key items, which should be considered in relation to this PR, that is performance equivalence and the parallel life of WIT and WITX. Both of these help to address the concerns expressed in this thread and related discussions. Performance Equivalence Justification / Mitigation of Concern A concern would be these newer language specific primitives being used to define interfaces upon which existing C code is heavily dependent. A socket interface is a great example of this. If a socket interface were to be implemented using At the same time it is important not to rule out innovation, and technology advancement, both in the runtime implementation and in the realization of the WASI standards themselves. So rather than blanket rejecting of new concepts, like The implication of course, may be, with regard to the malloc discussion between @sunfishcode and @penzn that the adoption of a WIT implementation would be dependent on showing no regression on performance between a witx implementation and a wit implementation. Some data here to validate this may be required? The resulting performance checks, may result in performance observations in a number of the key languages WASM targets. C, Rust, Go ? - Others ? I know this puts an additional onus on those proposing standard changes, and that this may slow standard evolution, but this, perhaps is justified, since as the WASI ecosystem matures the cost of interface change increases, and rapid, performance impacting changes will drive away adoption in general. The Parallel life of WIT and WITX Justification / Mitigation of Concern This may extend the life of WITX; but it provides an important runway for WAMR in particular. There are WAMR users with 100,000s of individual devices deployed in the field. The rate of runtime change is considerably slower in the IoT and embedded world, than in the cloud / data center environment. 
Aside from the practical implications of deploying an updated runtime, there is a need to allow time for the runtime's own evolution to support the component model, and to ensure it can continue to execute the same functional payload with the existing hardware specifications (see performance above). A concern would also be that a move to a WIT-only world at this point in the WASI journey may result in standards only being proposed and considered by the subset of the community actively engaged in WIT-compliant runtimes. Those not working on runtimes which can adopt the standards would struggle with the context necessary for active participation in the conversation, and wouldn't be able to propose standards which they themselves could implement, at least until we have wider support. Taken together, I think these two suggestions help to address a number of concerns which impact the embedded and IoT world. Thoughts?
My final 2¢: I don't mean to be gloomy or delay adoption of the Component Model in WASI, I just think this should be shared for awareness. I think this change, with mandating the component model, as opposed to it being 'just an external API', presents a qualitative change, however minor, where WASI would need some core features. There are two reasons for it in my view:
This is concerning, because so far there is not much traction in supporting the component model or other core features mandated by this change in browsers (the threads draft hasn't been presented yet, for example). This would create divergence between WASI and the 'stock' web environment, while more convergence would be preferable in my opinion, as there are currently some challenges with running the same code both ways. Maybe a compromise would be to allow WIT and WITX to coexist for the time being; at least this way exploration of the component model can continue while maintaining backwards compatibility with the existing WASI approach.
(sorry for the slow reply due to holiday + travel) Thanks for the thorough writeup @woodsmc. For my part, I agree on both your broader points and suggestions. As for how I think we could go about concretely integrating these into the docs and process: For the ‘Performance Equivalence’ point, while I agree that our goal is to ensure that Preview 2 doesn't regress performance (after all, performance is one of the main motivating factors for using wasm in the first place), I think it’s important that we don’t choose a fragile evaluation criterion or one that prevents iteration and real-world feedback. Having worked on benchmarks for some years before, I know it’s surprisingly easy to write well-meaning benchmarks that completely misrepresent real-world performance and lead developers in the wrong direction. Additionally, just by nature of older code paths having received more optimization, I think it’s important to distinguish between any temporary regressions that may occur due to newness and lack of optimization vs. essential performance differences that indicate roadblocks to better performance in the future. Lastly, I think that at this point in time it's really important to ship our next iteration of WASI this year and fix any lingering performance issues in the next iteration next year. Based on all that, my suggestion is that we add a “Performance goals” subsection to preview2/README.md (added in #550) that says something to the effect of:
For the ‘Parallel Life of WIT and WITX’ point, agreed and perhaps we could add a “WITX” section to the current WitInWasi.md that describes how .witx files can be derived from .wit files according to the Canonical ABI and how wasm engines can implement single-module components using just these derived .witx files and their existing WITX machinery. Does that sound reasonable? |
@penzn Agreed that we should generate WITX from WIT (via the Canonical ABI) to help developers transition. However, to the wasi-threads point, I think threads are wholly independent: the problem with wasi-threads is the O(MxN) function table space usage it implies (assuming M threads and N functions) and the consequent problems for dlopen()-style dynamic linking, not the component model. If it weren't for these problems, it would be easy enough to add a |
Do you have a link to a discussion/notes where this was stated previously? I can see a couple of prior discussions where it was either implied or directly said that threads and similar APIs are ultimately incompatible with the Component Model. For example in WebAssembly/wasi-threads#48 (comment): Even if the issue of threads was completely decoupled from the component model, it would be an additional instance of needing a core wasm feature in WASI; instance-per-thread multithreading is used on the Web today.
There's been a lot of discussion on this thread, and in many other venues, on this topic. We have pushed the vote on this back a month so that these discussions can continue. We believe that all of the key concerns are resolved enough that we can hold a vote tomorrow for this PR to land. So this is a last call for tomorrow's vote: if you still have concerns that would warrant a "no" vote, can you please let us know today? It's also OK to say so at the meeting / vote itself, of course.
component model is often used to link core instances and it in some cases needs to involve function table, thus has problems wrt wasi-threads similar to dynamic linking, doesn't it? |
my understanding is that such an optimization will be an ABI breaking change at core wasm level. |
I've now added commits to #550 to add the content of the discussion about performance and witx and wit. |
This PR was approved by unanimous consent in the 9/07/23 WASI subgroup meeting https://github.com/WebAssembly/meetings/blob/main/wasi/2023/WASI-09-07.md |
Phase 4 Advancement Criteria got renamed to Portability Criteria in WebAssembly/WASI#549, so rename it in this document. Also, this document never got the portability criteria filled in, but we have assigned it the same criteria as was filled in for wasi-poll, which got merged with this package in #46
This is to align language in the WASI phase process with all pre-existing WASI repos. Phase 4 Advancement Criteria was renamed to Portability Criteria in WebAssembly/WASI#549, so rename it in this document.