Stay reasonably neutral in problem statement #25

Open
wants to merge 38 commits into
base: main
from

dcodeIO
Contributor

@dcodeIO dcodeIO commented Sep 15, 2022

Given the long history of controversy around the topic, I think the problem statement reads as somewhat tendentious, so its claims can be misunderstood as unalterable facts, which in part they are not.

In particular, the Component Model's choice to only support USVString is called a requirement herein, even though the Component Model is merely in phase 1, is about to be formally objected to with the next phase advancement, and could trivially support DOMString as well (adhering to precedents like WebIDL, JSON, the many language standards, etc.) so Java-like languages are not unnecessarily confronted with friction when interfacing with JS, themselves or each other. Some people, including myself, consider the Component Model's choice arbitrary for this reason, as it inserts a conceptual discontinuity between two things that would otherwise work just fine if DOMString were supported, which would significantly reduce the scope this proposal has to cover.

I hence propose to remain neutral in the problem statement, including that prior discussion is taken into account and people don't have a reason to feel too strongly over the reasoning.

@dcodeIO dcodeIO mentioned this pull request Sep 15, 2022
11 tasks
README.md Outdated
Co-authored-by: 1zizi1 <56330238+1zizi1@users.noreply.github.com>
README.md Outdated
@dcodeIO
Contributor Author

dcodeIO commented Sep 16, 2022

Thinking a bit more about how to deal with differing strong opinions in a neutral way, I think there are two valid options:

  • Balance: Briefly summarize both opinions if the context is considered noteworthy.
  • Abstention: Do not mention either opinion and phrase very carefully so nothing is implied.

I've tried to adhere to these in my latest edit and am looking forward to your suggestions if you think there are further improvements.

README.md Outdated

The WebAssembly [Component Model](https://github.com/WebAssembly/component-model) requires well-formed strings, as do some compile-to-JS programming languages, many data encodings, network interfaces, filesystem interfaces, etc. Interfacing JavaScript strings with such APIs is a common use case that therefore suffers from conversion burdens. In particular because conversion from `DOMString` to `USVString` is lossy (common options are to replace unpaired surrogates or to throw an error) there is a regular need for string validation both within the platform and for certain userland use case scenarios.
The authors of the WebAssembly [Component Model](https://github.com/WebAssembly/component-model) have, as part of the WebAssembly CG's decision-making process to advance the Component Model [to Phase 1](https://github.com/WebAssembly/proposals), achieved consensus under notable opposition (including a [formal objection](https://www.assemblyscript.org/standards-objections.html#component-model-2022-09)) to only support `USVString` on WebAssembly component boundaries. The authors of the Component Model consider Unicode text to be the best design for their users, whereas the concern expressed is that support for Java-like languages and safely interfacing with JavaScript and Web APIs is not covered by the Component Model. The authors of the Component Model have chosen the replacement strategy, so interfacing `DOMString`s with WebAssembly components, including their envisioned ESM-integration, will be subject to silent mutation of string data. Independently, there is a significant number of data encodings, network interfaces, filesystem interfaces, compile-to-JS programming languages, etc. designed for Unicode text (`USVString`) rather than `DOMString`. Therefore, interfacing JavaScript strings with such APIs is, also given the preferences of the WebAssembly CG, an increasingly common use case that suffers from the outlined conversion burdens. To detect or address the side effect of silent data mutation or thrown errors early, there is a regular need for manual string validation and sanitization both within the platform and for certain userland use case scenarios. This proposal addresses both detection and sanitization.
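
For illustration, the two conversion strategies named in the paragraph above (replace unpaired surrogates, or throw) can be sketched in userland JavaScript. This is a minimal sketch, not the proposal's specification text: the function names `isWellFormedSketch`, `toUSVStringReplace` and `toUSVStringThrow` are hypothetical, and the use of U+FFFD as the replacement character is an assumption taken from the usual `USVString` conversion behaviour.

```javascript
// Hypothetical helper: true if `s` contains no unpaired surrogate code units.
function isWellFormedSketch(s) {
  for (let i = 0; i < s.length; i++) {
    const cu = s.charCodeAt(i);
    if (cu >= 0xd800 && cu <= 0xdbff) {
      // A lead surrogate must be immediately followed by a trail surrogate.
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // skip the trail half of a valid pair
        continue;
      }
      return false;
    }
    // A trail surrogate reached here has no preceding lead surrogate.
    if (cu >= 0xdc00 && cu <= 0xdfff) return false;
  }
  return true;
}

// Replacement strategy (assumed U+FFFD): lossy, silently mutates ill-formed input.
function toUSVStringReplace(s) {
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const cu = s.charCodeAt(i);
    if (cu >= 0xd800 && cu <= 0xdbff) {
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        out += s[i] + s[i + 1]; // keep the valid pair intact
        i++;
      } else {
        out += "\uFFFD";
      }
    } else if (cu >= 0xdc00 && cu <= 0xdfff) {
      out += "\uFFFD";
    } else {
      out += s[i];
    }
  }
  return out;
}

// Throwing strategy: rejects ill-formed input instead of mutating it.
function toUSVStringThrow(s) {
  if (!isWellFormedSketch(s)) {
    throw new TypeError("ill-formed string: unpaired surrogate");
  }
  return s;
}
```

With the replacement strategy, two different ill-formed inputs can map to the same output, which is why the conversion is called lossy; the throwing strategy avoids the mutation but moves the burden to the caller.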
Member

Why do the minutiae of the wasm decision-making process matter here? It’s clear that you object to this decision in wasm but readers here shouldn’t need to know or care that anyone objected.

Contributor Author

@dcodeIO dcodeIO Sep 16, 2022

I think the context is important here, because, since it is directly connected, it increases the surface area this proposal has to cover, and in part motivates the timing to pursue the proposal at this point in time. In particular, if the Component Model would revise its decision and allow passing DOMStrings (that are also Java, Dart, Kotlin, C#, etc. strings) over WebAssembly boundaries (i.e. to each other and JS / Web APIs) without eagerly mutating them, the decision would impact the scope of the problem statement. One critical aspect is, as I think, that the Component Model is on track to be used for ESM integration (hence mentioning), which would introduce the complications this proposal addresses for the entire platform when JavaScript modules are transparently replaced with WebAssembly modules, either directly or in dependencies. Do you have ideas how to integrate these relevant aspects in a form that you'd be OK with?

Member

The usefulness of this proposal exists even if wasm wasn’t part of the discussion.

I think documenting wasm’s decision is sufficient, and zero context about the decision besides maybe a link is both sufficient and desirable.

Contributor Author

While I would agree that the proposal can be considered useful independently, all of timing, reasoning and extent of performance considerations in browser implementations are directly connected to the new situation introduced by the Component Model, as it significantly adds to scope and urgency. To illustrate, I contend that there is an alternative time line where userland implementations would be considered sufficient, say if the WebAssembly CG had more thoroughly addressed the far-reaching concerns and decided differently. The extraordinary platform complications, and why they will exist, certainly deserve a mention in my opinion. Open to restructure if you have ideas how to integrate these important aspects in a form that is more desirable to you.

Member

I think that merely stating the wasm semantics with zero of the history is sufficient to convey that.

Contributor Author

I certainly recognize that there is a tension here, that, as it seems, revolves around a preference that little to zero context is provided. On the one hand, I would argue that a felt sense for zero context has little weight, and on the other that similar lack of context has led to the Component Model becoming unnecessarily suboptimal for many popular programming languages and, here in particular, JS / Web API interop affecting ECMAScript as well, now extending the complications to the entire platform. From what I've witnessed, bad outcomes are best prevented by providing related context, better more than too little, so interested parties are well-informed before decisions are made. And in this case, the context helps to answer important questions: "Why even?" and "Why now?"

Member

The issue is that the "why" of WASM's decision does not matter here. TC39 proposal readmes aren't a pulpit for WASM opinions :-)

WASM decided. Therefore, that's the world we live in; whether it's better or worse is irrelevant. The extra content (quite editorial, because "neutral" is an editorial decision also) is just distracting to readers here.

@michaelficarra
Member

@dcodeIO I think you're overthinking things. The purpose of this explainer is to make a compelling argument for the utility of the proposed features. The level of nuance you're looking to include here doesn't seem like it would help with that, and may, in its verbosity, actually take away from it. I'm willing to make the changes we discussed above if you would still prefer that to the current text. Otherwise I'm inclined to stop trying to resolve this concern and either leave the README as-is or remove all mentions of wasm/wasi. I'm confident that this proposal is sufficiently motivated even in their absence.

@dcodeIO
Contributor Author

dcodeIO commented Sep 16, 2022

Please allow me a little more time to figure out if I can come up with a different structure or a shortened account, as I am so far not seeing any suggestions or discussions that would contribute to the title or intent of the PR. An alternative approach might be to move the additional context to the FAQ, so related questions are reasonably answered. Would this be something you'd be comfortable with?

@michaelficarra
Member

@dcodeIO Sure, let me know when you'd like another review, but please keep in mind the purpose of the explainer.

@@ -70,3 +70,7 @@ Performance optimizations are up to implementations and are not guaranteed by th
### Are consumers going to do anything other than convert when they encounter ill-formed strings? If so, why not provide only a conversion method with a fast path for well-formed strings?

Consumers may want to throw/error when encountering ill-formed strings. Also, consumers may want to defer the conversion or the error until later when the String is actually interpreted as Unicode text. These use cases justify the test-only method.
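
As a hedged illustration of the "defer the error until later" use case described in this answer, the sketch below stores a possibly ill-formed string untouched and only validates it when it is actually interpreted as Unicode text. The class `DeferredText` and the helper `looksWellFormed` are hypothetical names, not part of the proposal; the helper relies on the fact that `encodeURIComponent` throws a `URIError` on unpaired surrogates, which is one long-standing userland workaround and not necessarily the fastest one.

```javascript
// Hypothetical userland test: encodeURIComponent throws a URIError
// when its argument contains an unpaired surrogate.
function looksWellFormed(s) {
  try {
    encodeURIComponent(s);
    return true;
  } catch (e) {
    if (e instanceof URIError) return false;
    throw e; // anything else is unexpected, so rethrow
  }
}

// Hypothetical wrapper: accepts any JavaScript string and defers the
// well-formedness error until the value is used as Unicode text.
class DeferredText {
  constructor(raw) {
    this.raw = raw; // no validation and no mutation at construction time
  }

  // Opaque use (storing, forwarding, comparing) never validates.
  get opaque() {
    return this.raw;
  }

  // Interpreting the value as Unicode text is where the test happens.
  asUnicodeText() {
    if (!looksWellFormed(this.raw)) {
      throw new TypeError("ill-formed string: unpaired surrogate");
    }
    return this.raw;
  }
}
```

A conversion-only method could not express this pattern, because the caller here needs the answer to "is this well-formed?" without receiving a mutated copy.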

### How does this proposal relate to the WebAssembly Component Model?
Member

This PR seems fine to me now, if this entire superfluous section is removed.

Contributor Author

There are several direct connections to this proposal in this section of the FAQ (that I moved there from the problem statement to do y'all a favor btw). In addition to explaining why the CM is probably the most notable precedent for this proposal, why timings overlap, why this is special, why performance is important, etc., the paragraph also aids future discussion. Today, this proposal uses the CM as an argument, and it is likely that tomorrow the CM will use this proposal as an argument to bolster its choices because "but you can check". This is highly relevant context that, if withheld for editorial reasons, will not be obvious and will likely lead to worse results overall than if it was provided. Please let's be responsible.

Member

As a JS developer, I should not need to understand or read anything about wasm beyond "wasm requires well-formed strings". None of the info about the component model is relevant to me. A TC39 proposal is not an appropriate place to create leverage for an argument within wasm.

Contributor Author

@dcodeIO dcodeIO Sep 19, 2022

This is not intended as leverage, really. If you can point me to sentences that feel like it to you, please tell me, and I'll do my best to phrase as objectively as I humanly can. Hardly anyone knows about the nitty-gritty details in play here (this proposal helps to some extent), even fewer people know how all this is connected, but exactly this knowledge is important, now more than ever with the CM doing things differently than the platform has done it before. As a JS developer, you should care because you have dependencies, and these dependencies have dependencies, you upgrade them, you bundle them. This string stuff is subtle, but also fundamental, and errors are delayed, almost impossible to spot. With the CM, JS developers need a bunch of highly specialized knowledge now that they didn't necessarily need to care about before. I do think that we are acting responsibly when we try our best to let them know about the what and why, so they can address it, either in code or in the spec. Please give me a realistic chance to improve this FAQ section. I think it's important that people are aware. That there is discourse about this, knowledge.

Member

I think the place that awareness should be spread is a wasm arena, not a JS one.

Contributor Author

Please, this is about helping JS devs. To help TC39. ECMAScript. I am out of options in Wasm, nobody cares there :(

Member

That's exactly the point - if wasm's decisions here aren't going to change, why is this extra context useful to JS devs? All they need to know is what a well-formed string is, how to check for it, and how to sanitize one if needed - which this proposal already provides for.

Contributor Author

Given that the CM is in phase 1, it's too early to know for sure whether Wasm's decisions are going to last. And as I said in my prior comment, JS devs, in particular those intending to consume Wasm modules, even more so those with an interest to use JS and Wasm in tandem (which is a fantastic use case btw and one of Wasm's stated goals), should care:

As a JS developer, you should care because you have dependencies, and these dependencies have dependencies, you upgrade them, you bundle them. This string stuff is subtle, but also fundamental, and errors are delayed, almost impossible to spot. With the CM, JS developers need a bunch of highly specialized knowledge now that they didn't necessarily need to care about before. I do think that we are acting responsibly when we try our best to let them know about the what and why, so they can address it, either in code or in the spec.

If they are kept in the dark, this will just remain the self-fulfilling prophecy it already is: Wasm does its thing, while all attempts to inform or include JS devs are stonewalled. Greetings to the same Wasm folks up- and downvoting here btw. :)

Member

If wasm makes decisions that make JS interop harder in practice, that hurts wasm devs, not JS devs, because that will just inhibit use of wasm. I don't see what JS devs can do about it - nor do I think that the entry to the wasm funnel will be on this particular TC39 proposal.

Contributor Author

@dcodeIO dcodeIO Sep 19, 2022

That doesn't sound like a good outcome, rather like an argument to better have an FAQ entry so nobody is hurt unknowingly. I think there's a lot that JS devs with an interest in Wasm can do btw. (I'm one of them, and look at me), just not if relevant context is withheld from them. And for sure this proposal is the entry point: Want to know what the problem is? Direct here. Ran into an issue? Direct here. Component Model? Soon directs here. In a sense, JS and Wasm meet exactly here, where the problem is addressed with manual checks.

4 participants