Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ReadableStreamDefaultController seems redundant #743

Closed
kentonv opened this issue May 10, 2017 · 6 comments
Closed

ReadableStreamDefaultController seems redundant #743

kentonv opened this issue May 10, 2017 · 6 comments

Comments

@kentonv
Copy link

kentonv commented May 10, 2017

Hi,

This is feedback on this API, after spending the day partially implementing it. It may be way too late for this kind of feedback, but I thought it would be better to post my thoughts than not post. Feel free to ignore me.

I've spent a lot of time in the past thinking about streams and the different interfaces they can have (e.g. when working on Protobufs and Cap'n Proto which care about this kind of thing). I'm impressed that this API seems to cover a lot of obscure cases that people often don't think about, e.g. push vs. pull, backpressure, producer-allocated vs. consumer-allocated buffers, etc.

However, the result feels overcomplicated, and I wonder if it could be simplified.

I wonder if the concept of a "controller" is redundant. I note that ReadableStream's "controller" is almost just a WritableStream, and I wonder if these could actually be unified.

Currently, ReadableStream's constructor takes an underlyingSource with three (optional) methods: start, pull, and cancel. What if instead of start and pull, it just provided pipeTo, with the same signature as ReadableStream#pipeTo?

The ReadableStream would not call this callback until some data is first requested. At that time, it can do different things depending on the request:

  • If getReader() is called, then the ReadableStream constructs a WritableStream that places chunks in a queue, which the Reader then pulls from. This WritableStream's write() method would return a promise which resolves when the queue is ready for more chunks (as evaluated using the queuing strategy, etc.), to provide backpressure. Having constructed this WritableStream, it is passed to the pipeTo callback that was given to ReadableStream's constructor.

  • On the other hand, if the ReadableStream's own pipeTo() method is called instead of getReader(), it can trivially call the pipeTo callback passing along the same target stream. The ReadableStream can then step out of the picture entirely! This is really nice because it avoids double-buffering, and it gives the original producer the opportunity to query the output stream with instanceof, possibly discovering a more-efficient way to implement the pipeline (a common thing to want to do, in my experience!).

The major thing missing from this approach is support for BYOB. I think this could be answered by just throwing byobRequest() onto WritableStream itself. Or, perhaps a bit more intuitively, the WritableStream interface could be extended with a method like nextBuffer() which returns a buffer to use, or null if the WritableStream doesn't care. The producer would then be expected to pass that buffer back into write() -- although they'd also have the option to ignore it and call write() directly. The implementation of ReadableStreamBYOBReader would detect this and perform the copy if necessary.

Thoughts?

@ricea
Copy link
Collaborator

ricea commented May 10, 2017

There is definitely some duplication between the ReadableStreamDefaultController API and the WritableStreamDefaultWriter API. The way I think about it is that they are specialised for different priorities: I assume that reading from a source is roughly side-effect free. I assume that operations can be retried, so we can be less sensitive about data loss. This is reflected in controller.enqueue() having no indication of success or failure. On the other hand, WritableStream is designed with extreme care to avoid data loss even when the developer is not particularly careful, and to make it easy for developers to be extra careful. This is reflected in the fact that writer.write() permits the writer to wait for the underlying sink to complete.

I am sorry for not responding to your points in detail. Hopefully someone else will fill in the gaps.

@kentonv
Copy link
Author

kentonv commented May 10, 2017

Hmm, I think I see what you're getting at.

It seems like ReadableStream doesn't know whether it wants specializations to be push-based or pull-based, and in trying to support both it gets complicated.

Normally, input streams are strictly pull and output streams are strictly push -- that is, after all, their defining difference. So the obvious/naive way to design ReadableStream would be to have its constructor to essentially take an implementation of Reader (or something close to it). getReader() could essentially return that object, while the other methods would be implemented in terms of it. That would then be a strictly-pull API.

So what about push? I think the main cases where you want a "push-based API for input" is when implementing a pull-based API would require a convoluted state machine. The usual case I run into is where, say, a ServiceWorker wants to generate a response body dynamically in script. It's much easier to write code for this that performs a series of write()s than to write code which responds to a series of read() calls.

I'm not sure this use case lines up with your "reads are non-side-effecting" model. I think that model really only aligns with true pull-based streams, not push-based code that is going into a queue to be pulled.

Maybe the right way to satisfy this use case is just to let people create a TransformStream that doesn't apply any transformation (effectively, just a pipe), and then write to the write end while passing off the read end to whatever needs it...

@tyoshino
Copy link
Member

We got similar feedback multiple times in past e.g. #102. @domenic should be best at explaining the rationale. I thought he's put it in the Requirements.md, but couldn't find.

@tyoshino
Copy link
Member

In addition to what @ricea commented, @domenic told the different goals of RS and WS when tried to make RS and WS completely dual where I attempted to make ReadableStream acknowledgable. The history of this discussion is captured by #253 and #324. In short, my understanding is WS is a queue of operations while RS is a queue of data.

@tyoshino
Copy link
Member

The ReadableStream would not call this callback until some data is first requested. At that time, it can do different things depending on the request:

I've been sketching how to optimize piping with identity streams in the middle at #511. Allowing skipping identities is yes good (#325). This can be realized even in the current design by extending the interfaces for sources/sinks, I think.

@kentonv
Copy link
Author

kentonv commented Jun 4, 2017

So, to make my feedback more concise:

It seems to me that if there were a function which constructs an RS/WS pipe pair, it could be used in all the same use cases where people would use the RS or WS constructors today. Given this, it seems to me that removing the constructors and instead providing a simple function to create a pipe pair would make the standard a lot easier to understand and to implement. Currently, I feel that the constructors are over-engineered.

It looks like @domenic addresses this head-on here: #102 (comment)

My understanding of his argument is that he's saying: "How does the make-pipe function construct the two objects that it returns? If they don't have constructors, then we're cheating, using 'C++ browser magic' to connect objects in a way that couldn't possibly be done in pure JS."

I don't buy this. I think it's reasonable to say: "The constructors are private implementation details." Or, better yet: "The objects returned are private subclasses of ReadableStream and WritableStream which override all of the methods." This is pretty standard OOP practice, so I don't see why you'd want to avoid it.

I feel like there's some design goal here that I don't understand, though, where the concept of an "abstract interface which cannot be instantiated directly" is considered inherently undesirable?

@domenic domenic closed this as completed Mar 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

4 participants