-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stream-whatwg: add whatwg streams #22352
Conversation
bc166d7
to
0755fea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m very skeptical about adding this feature to Node.js. There is a lot of momentum and compatible APIs in the Node.js ecosystem about Node Streams (with their problems). Adding another implementation might not be in the best interest of the runtime.
What is the goal? Should we look for interoperability with whatwg streams? Can you link the other PRs or topics you needed these for?
Tagging as semver-major because it adds new globals.
Can you please remove the git submodule?
(I’m putting a -1 for this not to land without a long discussion).
i thought about this a bit. the conclusion i came to is that it would certainly be weird, but i couldn't think of any actual problems here that we haven't hit before. the biggest issue would be the confusing of having multiple interfaces which is obviously a valid issue but i think whatwg streams have benefits that outweigh the confusion. as a parallel, we ship two different versions of URL in core as well. in some future i would like to imagine that we only really use whatwg streams, the same way we've mostly transitioned to whatwg urls.
to add whatwg streams to enable other various features down the line, such as creating interfaces in node which would be compatible with webassembly streaming or implementing isomorphic http interfaces. if we did implement fetch it would also give us the response object which could be used in the esm loader to pass data around, although that idea hasn't really gone beyond an idea. there aren't really any prs to link because no one's tried to implement this stuff yet. it is also just an excellent interface that users of node can benefit from having exposed.
why? how should i bring wpt into the repo? |
All of the the most used features of Node.js are based on Node Streams. This will likely double the side of our API in a lot of areas: What is the whatwg version of net.Socket? tls? http? Reimplemeting all the above is a lot of work and a lot of disruption for the community. In comparison, the URL change was tiny, and we used it very little in node core. Streams are used everywhere, and some of the semantics goes down to C++.
We vendor (copy) our dependencies. |
I will be great to have whatwg streams in node at least for compatibility reasons 🎉 |
I share @mcollina's concerns about this and def believe this belongs in userland. |
0755fea
to
2514c91
Compare
@mafintosh at the very least a situation where we allow people to pass whatwg streams into node apis would be a win for me. i'd like to actively pursue features like fetch and wasm streaming being in core. @mcollina fwiw whatwg streams are actually a lot safer and easier to use than node streams from c++ because of the reader locks, separation of stream events and stream producer events, etc. chromium has some nice apis here for interacting via c++ that i think we could take advantage of. |
also after reading through the whatwg stream spec and node stream implementation, it looks like it would be safe to either make node streams extend whatwg streams, or if that makes people uncomfortable i think something like nodestream.acquireStandardStream() would be usable |
This seems to be importing the entire WPT? Including those that don't make sense for us? |
common.gypi
Outdated
@@ -24,6 +24,17 @@ | |||
|
|||
'openssl_fips%': '', | |||
|
|||
'v8_extra_library_files': [ | |||
'./lib/v8_extras/ByteLengthQueuingStragety.js', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo in file name... s/ByteLengthQueuingStragety/ByteLengthQueuingStrategy
Not yet convinced that this should be in core and definitely concerned about the massive import of WPT. One way we could make progress on this is to follow a development model much like we did with http2... that is, create an experimental fork repo, get this landed there, allow folks to iterate on it, then decide whether to pull it in. That said, this is something that could definitely be done as a userland module and we already have three versions of the existing streams API in core and we definitely need to have a decision around how to do this. Also, in general, I'm not convinced that using v8_extras is the right approach for this. I think there are some large thorny issues we need to address in the implementation and in terms of how this would land, so a separate working repository could really help to iron those things out. |
6d1a805
to
0f3b850
Compare
@joyeecheung @jasnell i did another trim of wpt to make it hopefully only contain interfaces we have in core.
do you mean developing intertop between node streams and whatwg streams? the whatwg stream implementation here is complete.
as i keep saying... i'd like to take advantage of the streams inside core. something i could have a pr for the day after this lands is wasm streaming. we would also be able to add Text{Encoder,Decoder}Stream interfaces. |
a7d2f2f
to
f6e3c30
Compare
@devsnek Maybe only include the stream tests in this PR, since the commit only says it adds whatwg streams? The diff seems to be much bigger than necessary to review. Also, does the test pass locally? For what I can tell at least the linter won't be happy, the WPT files should probably be placed in fixtures to avoid being linted. A lot of tests ending with |
@joyeecheung all the selected tests pass (you can look at the travis build to see the TAP output) |
8679b96
to
7227280
Compare
f615c9e
to
e0117e7
Compare
@@ -24,6 +24,17 @@ | |||
|
|||
'openssl_fips%': '', | |||
|
|||
'v8_extra_library_files': [ | |||
'./lib/v8_extras/ByteLengthQueuingStrategy.js', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are these files consumed by the build? I see that gypfiles in deps/v8 references v8_extra_library_files
, are you relying on the compilation of v8 to grab these lib/v8_extra
js files and compile them in? That will be more awkward for node-chakracore to work with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's exactly what is happening. The V8 build compiles the v8_extras in. They are not loaded the same way as every other thing in core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can chakra not just put these files somewhere and eval them when a context is created? we do plenty of stuff elsewhere that chakra has to shim.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, chakra shims a lot of things, but so far it has been native APIs (or js APIs). Not side effects of the build system. Sure, it can be done, but I'd much prefer not to.
Personally I also feel like this makes the build of node more complex to reason about, it is that much less of a dependency tree and more of a dependency graph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you wanted to have these JS files live in lib
and be compiled to native, then my preferred approach would be to explicitly put that logic in node's build, not use a side effect of v8's build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MSLaguana are you saying to just put the list in node.gyp instead of common.gypi?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not the list, but the thing that consumes the list. The bit that takes the list of filenames and processes it to the point where the relevant build artifact is produced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking through the implementation, I'm not sure which aspects of it would be infeasible to implement as a normal module. Doing so would make it significantly easier to polyfill, would make it more likely that it could just be dropped in to readable-stream
, and wouldn't require any changes to the build system or any coordination between v8 and chakra-core. @devsnek ... can you explain a bit more about why you think such a port wouldn't be feasible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
v8 consumes the list and turns it into a c++ file that is linked internally to the engine. chakrashim would need to do something similar, calling those extras functions when new contexts are created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasnell because its a lot of stuff to rewrite (i don't have the time to do that) and i'd like to be able to collaborate with chromium by keeping at least similar implementations. readable-stream can use any of the polyfills on npm or even vendor the reference implementation from the streams repo.
That would also be possible with it in
I don't have to know that Assuming, for a moment, that we stay with the approach of introducing new methods on the const rs = socket.toReadableStream()
const ws = socket.toWritableStream()
const rs2 = anyReadble.toReadableStream()
const ws2 = anyWritable.toWritableStream() This would eliminate any ambiguity that would exist in that API and would eliminate the need to special case
Again, "nice" is entirely subjective here. I don't find the following to be particularly "nice"... const acquireStandardStream = Readable.prototype.acquireStandardStream
const rs = acquireStandardStream.apply(someArbitraryStream) But this is likely the pattern that I would need to implement in order to be certain that all things conforming and pretending to be The following is certainly no worse: const { toReadableStream } = require('stream')
const rs = toReadableStream(someArbitraryStream) You are absolutely right that there are lots of things that streams come from, including things that do not extend from In any case, I've pointed out these limitations in the current proposed design a couple of times now so won't continue to belabor the point.
Including such links in the documentation are fine, but they are not a substitute for proper and consistent API documentation. |
In terms of handling const http2 = require('http2')
const server = http2.createServer();
server.on('stream', (stream) => {
stream.on('error', console.log)
const rs = stream.acquireStandardReadableStream()
rs.pipeTo(process.stdout.acquireStandardWritableStream())
stream.respond()
const ws = stream.acquireStandardWritableStream()
ws.abort('no reason')
})
server.listen(8000, () => console.log('ok')) Running a client against this code causes the server to crash with the following:
The |
If I understand correctly, this PR tries to run the WPT by running the const harness = fs.readFileSync('path/to/testharness.js', 'utf8');
vm.runInThisContext(harness);
vm.runInThisContext('test(function() { assert_false(true) })'); // nothing happens
vm.runInThisContext('assert_false(true)'); // throws exception |
@joyeecheung you need to install test callbacks, you can see the ones I have in wpt.js |
@devsnek Indeed, sorry for not looking into the docs more carefully before! |
The first and most obvious take that I think will be pretty universally held is "whoa that's a lot of code". This adds and API that we already have with no clear benefit aside from being a different API. I'm presently not convinced the API is better than streams3. Let's do a bit of a run down. WhatWG Pros
WhatWG Cons
Other WhatWG notes
Uncertainties
Due to the relatively small benefit, as far as I can tell, I'm not sure what Node.js would be gaining here. Having more than one API could be confusing - which one do you pick if you are building things? Also, in a future supporting both, we are likely to run into the unfortunate problem of going back and forth with transforms between the two streaming methods, likely resulting in unnecessary queuing as both streaming APIs maintain different types of queuing internally. That seems pretty bad to me, and it seems favorable for Node.js development[1] to keep WhatWG streams to the web. As far as web-compat, we aren't a web browser, and so that should be largely irrelevant to us. We don't have to lock ourselves to web standards like browser do, and so we should be able to ensure development with Node.js in nice and works well for Node.js's uses without web standard impediment. (This is different than the language spec, I should note.) I know people seem to like "isomorphic" JS code, but in my experience non-simple isomorphic dependencies tend to either be unreasonably troublesome to use, or perform significantly worse than ones built to cater to either Node.js or the Web, respectively. [1] Ok sure, the more technical debt the more job security for support companies and consultants, but that's not what I'm here for. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that userland is a better place for this, as I outlined above.
Additionally, on a purely PR and not feature level:
- GitHub breaks on the PR size, making it difficult to review. Do we really need all of this? I feel like we almost certainly do not.
- I don't think we should be vendoring v8_extras unless otherwise impossible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m very skeptical about adding this feature to Node.js. >>> There is a lot of momentum and compatible APIs in the Node.js ecosystem about Node Streams (with their problems). Adding another implementation might not be in the best interest of the runtime.
So much of the discussion here has been on "how good is this PR", which while useful does not, in my opinion, constitute the whole picture. I would invite you all to look at the bigger picture associated with this PR for a moment, rather than the PR itself; to think of the opportunities for growth that will be unlocked by this PR.
|
Unfortunately i cannot join the TSC meeting, and that post merit a long response in writing. Please postpone any decisions on the matter. |
Discussed briefly at TSC meeting. Removing tsc-agenda label. Please re-add as appropraite. |
This doesn't seem correct to me. You could use a WhatWG stream module instead. As far as I'm aware what is in this PR could be put in a user module and provide an interoperability point from Node streams to WhatWG streams. Do note that in either case you are probably going to run into unnecessary queuing, and that is no different without a wholesale move to WhatWG streams, which doesn't seem very realistic (or perhaps even desirable) to me.
I don't see any reason why we'd presently be preventing user module WhatWG streams. If you want your user module to do it, then go and do it. Node core is not actually stopping you here. The above caveats apply in any case, and this doesn't seem to make them any better. |
@Fishrock123 it is unsafe to depend on polyfills at a library level, if different libraries use different polyfills it can break things hardcore when you mix them. We also have at least two features within core (Encoding streams and compileStreaming) that can use this. |
This is a long response, I'm sorry about it. node-fetchIf you would like to take node-fetch to WHATWG stream, I'm up to help in building the modules and the ecosystem to help work well in the Node.js world. However this must involve a polyfill because of our LTS cycle as Node.js 8 is maintained for ~2 years. I think this is something we should resolve earlier than that. Some of the usage that I've seen for node-fetch is: global.fetch = async function(url, params) {
// External call, just use node-fetch
if (!url.startsWith('/api')) return nodeFetch(url, params)
// Local fetching, serve with fastify.inject
const response = await server.inject({ method: 'GET', url, ...params })
response.status = response.statusCode
response.statusText = response.statusMessage
return new nodeFetch.Response(response.payload, response)
} As a platform, we need to do a better job so that our users do not need to monkey patch the global object to make their code work. I think that we need a better pattern than pure I think a pure, spec-compliant and global WHATWG streams are never going to replace Node.js streams
There is a lot of baggage behind this. Several collaborators walked away from the project because of this and the not-so-great interaction between individuals. For these reasons mentioning WHATWG Streams stirs pushback immediately. Let's leave all of that in the past. If we consider WHATWG Streams are never going to replace Node.js streams as a design goal, we can start working towards improving the compatibility of Node.js with them. I'm happy to work with that assumption. WHATWG streams are easier to learn for new Node.js developersCan we improve the docs of Node.js Streams, considering that they are there to stay? Can somebody help?
I think the major source of that was fixed in #17979 and readable-stream@3. If there are more weird things let's talk about it again.
I envision a world where the streams complexities blurs in the background (but are available for experts), and there is a common-ground way of doing things just using language constructs, and easy, portable helpers. async function run (origin, dest) {
try {
const writer = buildWriter(dest); // needed because of emit('error')
for await (let chunk of origin) {
await writer.write(chunk.toString().toUpperCase());
}
await writer.end();
await finished(dest);
} catch (err) {
cleanup(origin);
cleanup(dest);
}
} or async function saveWebsite(url, dest) {
const response = await fetch(url)
// pipeline can pipe things between WHATWG Streams and Node.js Streams
await pipeline(response, fs.createWriteStream(dest))
} IMHO both approach are simple. I would prefer that we focus on shipping a runtime where that is feasible.
Node.js streams works in exactly this way, and it is a pretty similar concept: there is a public API and if you adhere to it things should work well. If the same concept does not apply to whatwg streams, let's open an issue on the their repo as this looks like a bug to me. As an example, native promises and async-await works well with promise libraries such as bluebird. ConclusionsI propose to spin up a team that has the following goals:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing a case in here for throwing more stuff into core when it could easily be done in userland. On top of that we have additional work going on around streams so I'd suggest adding more confusion to something that's in flux is a pretty bad idea.
if (mode === 'byob') { | ||
// TODO(ricea): When BYOB readers are supported: | ||
// | ||
// Return ? AcquireReadableStreamBYOBReader(this). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This version doesn't even support BYOB readers...
In discussions, Trevor Norris tipped me off that WhatWG streams may not clean up resources correctly after errors when Tee'd (multiplexed). (Or perhaps even to the extent we'd like at all after errors?) So far I'm largely unable to tell though, the complexity of this whole thing is unreasonable. (Which is why I have been approaching stream APIs completely differently.) |
school is back up so i don't really have time to pursue this change anymore... if anyone else wants to use this diff feel free. |
This pr adds a whatwg stream implementation from chromium and a WPT test runner.
I was inspired to work on this specifically because of the WebAssembly streaming PR that was recently opened but I think these streams stand well enough on their own.
This provides the a piece of the puzzle for wasm streaming, fetch, and some things I'd like to see used in esm.
I realize that this isn't really in @nodejs/streams's plans and such so I'm marking this as discuss and in progress.
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes