Ensure the JS thread doesn't have to get hit when piping two streams to each other #97

domenic · 2014-03-11T19:46:41Z

E.g., an XHR stream to a file system stream, xhrStream.pipeTo(fsStream).

As-is, there might be a problem if e.g. the user does

var oldWrite = fsStream.write;
fsStream.write = function (chunk) {
  console.log("AOP logging!!", chunk);
  oldWrite.apply(this, arguments);
};

after the pipe is already set up. The implementation would have to detect this and start routing data to the main thread, which kind of sucks.

A possible fix would be caching the value of dest.write at the top of pipeTo.

/cc @sicking

The text was updated successfully, but these errors were encountered:

sicking · 2014-03-11T21:23:05Z

One important question is if we actually want to support code like the above?

My concern is that it makes it very easy to create footguns where you suddenly get dramatically slower performance for something that is seemingly a harmless no-op.

For example

var oldWrite = fsStream.write;
fsStream.write = function (chunk) {
  if (debug)
    console.log("saved", chunk);
  oldWrite.apply(this, arguments);
};

or even

fsStream.write = fsStream.write.bind(fsStream)

I don't know the answer. But it's not obvious to me that it's beneficial that overriding write affects pipeTo.

domenic · 2014-03-11T21:31:19Z

The issue is that if pipeTo doesn't go through a public API like write then it is impossible for users to create streams themselves; they can only use platform-provided streams.

tyoshino · 2014-03-13T09:56:34Z

users to create streams themselves

This is important, but I don't think we need to enable replacing write of dest with ongoing piping.

We should also clearly specify how pipeTo determines whether to use such an offloaded/platform piping or call JS function.

domenic · 2014-03-13T11:33:33Z

The base pipeTo should always follow the specified algorithm; subclasses can always override it with more specialized versions.

tyoshino · 2014-03-13T14:45:37Z

The base pipeTo should always follow the specified algorithm; subclasses can always override it with more specialized versions.

I meant how to check if dest can receive piped data out of JS thread or not. If dest's write is overridden by JS code, it needs to be run in JS thread. Otherwise, we could do some optimized offloaded pipe depending on dest's type.

domenic · 2014-03-13T14:50:56Z

Well, my understanding of @sicking and others' ideas was that if you did certain special combinations, e.g. xhrStream.pipeTo(fsStream), then XHRStream.prototype.pipeTo would recognize it's dealing with an FSStream, and do off-main-thread piping work. That is, something like

XHRStream.prototype.pipeTo = function (dest, options) {
  if (dest.[[InternalStreamType]] === "file") {
    cppBindings.pipeTogetherOffMainThread(this, dest, options);
  } else {
    ReadableStream.prototype.pipeTo.apply(this, arguments);
  }
};

domenic · 2014-03-13T14:56:12Z

Er, wait, I realize that doesn't address what you're talking about. Yes, I agree, determining clearly what the criteria are is important... the spec should provide some hook other specs can use so that they can do e.g.

XHRStream.prototype.pipeTo = function (dest, options) {
  if (dest.[[InternalStreamType]] === "file" && isUndisturbed(dest)) {
    cppBindings.pipeTogetherOffMainThread(this, dest, options);
  } else {
    ReadableStream.prototype.pipeTo.apply(this, arguments);
  }
};

Here isUndisturbed is the hook. It wouldn't be an actual function (I don't think?), but just an algorithm other specs could refer to.

This is all of course assuming our solution is indeed of the caching-write flavor described. Other solutions would be some kind of double-dispatch, which we could conceivably even bake into the base spec... e.g.

ReadableStream.pipeTo = function (dest, options) {
  if (dest.pipeFrom(this)) {
    // successfully hooked up at the OS level; all done
  } else {
    // the dest had no special way of hooking itself up to `this`, so do the usual algorithm
  }
};

tyoshino · 2014-03-14T09:36:08Z

Right. Thanks.

One more point I'd like to clarify is how dest's state getter should work during offloaded pipeTo() is working. I guess the right value to return in such a case (not yet closed) is basically "waiting".

ferasm · 2014-04-02T16:08:20Z

I like the concept of isUndisturbed.

I also agree it would not be an exposed function, and would like to have it be well defined in the spec - I can see other specs needing this behavior documented and will be able to reference this spec for a definitive algorithm.

@tyoshino - I don't quite understand why the state getter should behave differently?

tyoshino · 2014-04-02T16:23:45Z

@ferasm state getter is the only synchronous (closed getter is also, but what it returns is the constant reference to the [[closedPromise]] instance) method on BaseWritableStream. So, I wondered how it should behave while pipe algorithm is running asynchronous to the JS thread. We should also discuss how the other methods (except for closed. It's clear) should behave but I was especially concerned with state.

othiym23 · 2014-04-04T21:26:10Z

/cc @Gozala

tyoshino · 2014-10-28T07:51:37Z

@yutakahirano pointed out that not only for pipeTo() but we want similar functionality for the constructor of Resource of the Fetch API. If we add ReadableStream to the RequestInit, we would want to prohibit read()-ing from the passed ReadableStream. I.e. giving exclusive access to the Request object.

See also w3c/ServiceWorker#413 and w3c/ServiceWorker#452

sicking · 2014-10-28T18:43:31Z

I also think that we really need the ability to send a ReadableStream or WritableStream to another thread using worker.postMessage({ theStream: stream }) or port.postMessage({ theStream: stream }).

In that case too we should make sure to not hit more threads than needed.

tyoshino · 2014-10-29T03:48:27Z

@sicking Is what you're proposing a Streams API based pipe for communication between workers/windows?

sicking · 2014-10-30T06:21:57Z

I'll avoid using the word "pipe" since it means different things to different people.

What I mean is that if I get a ReadableStream or WriableStram from something like writablestream = filesystemDirectory.writeToFile('filename'); or readablestream = fileReader.readAsStream(); or xhr.responseType = 'stream'; xhr.send(); readablestream = xhr.response I would expect to be able to pass that stream to a different thread. And that I could do that using port.postMessage(readablestream/writablestream). And that that data read or written would go directly to/from the worker thread to the filesystem or network.

tyoshino · 2014-11-04T12:49:19Z

@sicking Is the ReadableStream neutered in the original scope? What I have in my mind is something like socketpair(2). We call e.g. worker.createByteStreamPair(). It returns a WritableByteStream, and the worker receives an event with a ReadableByteStream attached. These streams are connected. Then, you do as follows:

In the main thread:

var fileStream = fileReader.readAsStream();
var streamToWorker = port.createByteStreamPair();
fileStream.pipeTo(streamToWorker);

In the worker thread:

self.addEventListener('message', function(e) {
  var streamToMain = e.data;
  // do something
}, false);

sicking · 2014-11-04T18:19:00Z

@sicking Is the ReadableStream neutered in the original scope?

Yes. Just like with any other transferable objects.

What's the advantage of the createByteStreamPair() approach? It seems more awkward to me, but maybe I'm missing some advantages? Should we do the same thing for MessagePorts? I.e. would there be advantages to adding something like port.createPortPair() so that people could use that rather than sending MessagePorts over postMessage?

sicking · 2014-11-04T18:56:06Z

And let me know if this discussion should maybe be done as a separate issue?

tyoshino · 2014-11-05T04:53:46Z

advantage of the createByteStreamPair()

Sorry, I hadn't give it much thoughts yet. A little more study follows:

The streams are so sophisticated than ArrayBuffer (just buffer. no state), String, Blob (immutable. no state), etc. I think we don't want to accept general streams. But even byte streams, string streams, etc. have states. In postMessage(readableStream) approach, do we need to pickle readableStream in the original scope and restore it in the new scope? It complicates, and I thought we don't have to do that and it's sufficient if we can connect windows/workers with back-pressured channel. Then, what should postMessage(readableStream) do?

Neuter references to readableStream. Create a bridge that reads from readableStream via the ReadableStream interface. Read data is cross-thread transferred to an object with the ReadableStream interface in the destination scope.
Do something special intruding the ReadableStream interface. E.g. for a stream obtained from XHR, detach TCP/TLS endpoint (file descriptor) from the stream and pass it to the destination scope and create a new object with the ReadableStream interface with the descriptor.

Thinking about (1), as we have pipeTo(), I thought we could just provide a way to generate a connected pair of ReadableStream and WritableStream and pass the read-side directly to the destination scope to avoid use of postMessage(readableStream) which implies serialization of the stream.

If we want to realize (2), we need to take postMessage(readableStream) approach.

tyoshino · 2014-11-05T04:58:19Z

And let me know if this discussion should maybe be done as a separate issue?

Created. #244

domenic · 2015-03-12T09:24:09Z

Quick update on this (old) issue: this is one of the highest priorities of the current design, pioneered with the stream reader concept. The only missing piece is a similar stream writer for writable streams. Once we have that in place this issue is done. We are very much on track though and have been designing the system around this goal.

tyoshino · 2016-06-23T08:19:39Z

The only missing piece is a similar stream writer for writable streams.

This is happening at #462

Related to #359, #325, #97, #321, #461

domenic added the implementer concern label Mar 11, 2014

This was referenced Apr 21, 2014

Can we add buffering strategies via a transform stream? #24

Closed

Transform backpressure approach 2 request for comments #110

Closed

domenic mentioned this issue Jul 17, 2014

pipeFrom to allow writable streams to "pull" at their own pace #146

Open

tyoshino mentioned this issue Jul 23, 2014

Reading binary into an existing buffer #111

Closed

tyoshino mentioned this issue Aug 18, 2014

Does FetchBodyStream need to be split w3c/ServiceWorker#413

Closed

This was referenced Oct 28, 2014

Backpressure on fetch integrated with Streams w3c/ServiceWorker#452

Closed

Should we "lock" readable streams while piping? #241

Closed

tyoshino mentioned this issue Nov 5, 2014

Communication between workers/windows via Streams API #244

Open

domenic mentioned this issue Mar 21, 2015

Performance Check vs. Node Streams #304

Open

domenic mentioned this issue Apr 7, 2015

Port locking infrastructure to writable streams #319

Closed

domenic mentioned this issue Nov 24, 2015

How generic is pipeTo? #407

Closed

isonmad mentioned this issue Jun 2, 2016

can streams be transferred via postMessage()? #276

Closed

tyoshino added a commit that referenced this issue Aug 30, 2016

Add a proof of concept of optimized pipe

0dd571c

Related to #359, #325, #97, #321, #461

tyoshino mentioned this issue Aug 30, 2016

Add a proof of concept of optimized pipe #511

Open

domenic closed this as completed in fd44398 Oct 28, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ensure the JS thread doesn't have to get hit when piping two streams to each other #97

Ensure the JS thread doesn't have to get hit when piping two streams to each other #97

domenic commented Mar 11, 2014

sicking commented Mar 11, 2014

domenic commented Mar 11, 2014

tyoshino commented Mar 13, 2014

domenic commented Mar 13, 2014

tyoshino commented Mar 13, 2014

domenic commented Mar 13, 2014

domenic commented Mar 13, 2014

tyoshino commented Mar 14, 2014

ferasm commented Apr 2, 2014

tyoshino commented Apr 2, 2014

othiym23 commented Apr 4, 2014

tyoshino commented Oct 28, 2014

sicking commented Oct 28, 2014

tyoshino commented Oct 29, 2014

sicking commented Oct 30, 2014

tyoshino commented Nov 4, 2014

sicking commented Nov 4, 2014

sicking commented Nov 4, 2014

tyoshino commented Nov 5, 2014

tyoshino commented Nov 5, 2014

domenic commented Mar 12, 2015

tyoshino commented Jun 23, 2016

Ensure the JS thread doesn't have to get hit when piping two streams to each other #97

Ensure the JS thread doesn't have to get hit when piping two streams to each other #97

Comments

domenic commented Mar 11, 2014

sicking commented Mar 11, 2014

domenic commented Mar 11, 2014

tyoshino commented Mar 13, 2014

domenic commented Mar 13, 2014

tyoshino commented Mar 13, 2014

domenic commented Mar 13, 2014

domenic commented Mar 13, 2014

tyoshino commented Mar 14, 2014

ferasm commented Apr 2, 2014

tyoshino commented Apr 2, 2014

othiym23 commented Apr 4, 2014

tyoshino commented Oct 28, 2014

sicking commented Oct 28, 2014

tyoshino commented Oct 29, 2014

sicking commented Oct 30, 2014

tyoshino commented Nov 4, 2014

sicking commented Nov 4, 2014

sicking commented Nov 4, 2014

tyoshino commented Nov 5, 2014

tyoshino commented Nov 5, 2014

domenic commented Mar 12, 2015

tyoshino commented Jun 23, 2016