Change the model for ReadableStream to have async read() #288
Conversation
This replaces the previous dual ready + read() approach, which was derived from the epoll(7) + read(2) paradigm. In #253, we discussed how the ready + read() model conflicts with the semantics we want for byte streams. Briefly, because some byte streams will demand to know the size of the buffer they must fill before doing any I/O (the fread(3) model), the readInto(arrayBuffer, ...) method for byte streams must be asynchronous. If such byte streams are then to conform to the readable stream interface, with a read() method derived from their readInto() method, then read() must also be async, across all readable streams.

This is a slight usability upgrade for consumers, in some cases. However, it potentially costs more microtasks when multiple chunks of data would be available synchronously.

In the process of updating the tests to reflect async read, they were given a number of small unrelated tweaks (e.g. to wording, or to eliminate some setTimeout(,0)s).

TODO:

- This commit eliminates ExclusiveStreamReader, but this was done in error based on mistaken assumptions. It will be reversed.
- Almost none of the spec is updated. Examples.md and the examples in the spec were updated, but none of the algorithms or non-normative notes.
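To make the shift concrete, here is a sketch of the two models from the consumer's point of view. The names and result shape are illustrative, not the final API (the { value, done } shape in particular is used here just for clarity; the stream and its methods are stand-ins):

```javascript
// Old model (epoll(7)-style): wait for readiness, then read synchronously.
async function consumeReadyRead(stream) {
  const chunks = [];
  while (true) {
    await stream.ready;                 // resolves once data is available
    if (stream.state === "closed") break;
    chunks.push(stream.read());         // synchronous once ready
  }
  return chunks;
}

// New model (fread(3)-style): every read is itself asynchronous, so a
// byte stream whose readInto(arrayBuffer, ...) must be async can still
// conform to the same interface.
async function consumeAsyncRead(stream) {
  const chunks = [];
  while (true) {
    const { value, done } = await stream.read();
    if (done) break;
    chunks.push(value);
  }
  return chunks;
}
```

The microtask cost mentioned above shows up in consumeAsyncRead: even when several chunks are already buffered, each await defers to the microtask queue once per chunk, whereas the old model could drain them all synchronously after a single ready.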
@tyoshino I am actually having a really hard time getting ExclusiveStreamReader working here. I remember you were able to make it work quite nicely in #277 and was wondering if you saw a similar straightforward solution here. I'm not sure which commit is a better starting place either, 2ee6a77 or 35c3e5b; at first I thought a readPromise would be helpful but now I am not so sure.
What semantics do you plan to give to the locking? I can come up with ...
Regarding release:
I liked the semantics we have in the current spec, where the stream presents itself as empty while locked, and the reader presents as a closed stream after released.
This works too, right?
Of these I think I prefer 3 > 1 > 2.
Of these I think 3 is best and fits well with "the stream presents itself as empty while locked".
I think on some platforms we can't cancel an async read operation. On such platforms, option 1 doesn't work.
We don't need to cancel it, though; just redirect the result to the reader's next read() promise instead of the already-returned read() promise from the stream.
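A minimal sketch of this "redirect" idea, with hypothetical internal names (`_readRequests` is an assumed internal FIFO of pending reader.read() requests, not anything from the spec): the in-flight read started on the stream is never cancelled; when it settles, its result is routed to the reader instead.

```javascript
// Route a pending stream-side read's result to the reader's first
// queued read() request, rather than to the promise the stream
// originally handed out.
function redirectPendingRead(pendingReadPromise, reader) {
  return pendingReadPromise.then(
    result => {
      const request = reader._readRequests.shift();
      if (request !== undefined) {
        request.resolve(result);
      }
    },
    reason => {
      const request = reader._readRequests.shift();
      if (request !== undefined) {
        request.reject(reason);
      }
    }
  );
}
```

Under this scheme the underlying async I/O never needs platform-level cancellation support; only the JavaScript-side bookkeeping of who receives the result changes.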
Interesting. Both 1 and 3 redirect pending read data to the reader, and the only difference is whether the result promise is rejected or kept pending, right? Either way, detaching the given buffer is necessary.
@yutakahirano oh oops, I hadn't thought about how this works with readInto + exclusive readers. Let me reevaluate. If you have already given fread() the array buffer, that might be problematic... I need to go stare at your suggestion about detaching more.
OK so.
Hmm. Now that we have this detaching-returning-different-buffers thing, maybe async readInto is not the right model, and feedArrayBuffer or setAllocator would make more sense. It feels weird to supply an argument to
Discussed with @tyoshino offline. We think introducing ByteSource and removing ExclusiveStreamReader is the right direction.
While a stream returned by
This greatly simplifies the situation. Regarding fetch, Body.body is a ByteSource.
I really don't like that. pipeTo having exclusive access is a feature that should exist on all streams (byte streams or not). And people should not have to think about whether they're dealing with a byte stream or another stream if they want to just pipe it somewhere else. It seems like we could accomplish the same simplification if we gave up on having pipeTo be explainable in author code. That was actually the original idea of #241.
Domenic, one question. We designed ExclusiveStreamReader to be almost indistinguishable from ReadableStream. This was, IIRC, to allow the user of the reader not to have to be aware that they are touching a reader rather than the original stream. Does this mean that the reader should also have getReader() to create a nested reader?
I think that's not quite right. Rather, we want users to be able to accomplish their tasks in full without having to switch back to the stream. That is why (in some iterations) we had a closed promise that just delegated to the stream itself. Otherwise anyone writing manual-reading code would have to remember to use the stream for closed and the reader for other operations. I don't think it needs to be a complete substitute though. So no getReader, and no piping. That is, the purpose of the similar API is to reduce cognitive load, not to actually be interchangeable to arbitrary consumers.
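A minimal sketch of this delegation point (the class and the `_reader` slot are hypothetical illustrations, not spec names): the reader's closed getter simply forwards to the stream, so manual-reading code never switches back to the stream, while getReader() and pipeTo deliberately stay off the reader.

```javascript
// Hypothetical reader whose `closed` delegates to the underlying stream.
class SketchReader {
  constructor(stream) {
    this._stream = stream;
    stream._reader = this;        // assumed internal lock slot
  }
  get closed() {
    return this._stream.closed;   // delegate rather than reimplement
  }
  releaseLock() {
    this._stream._reader = undefined;
  }
}
```

Because closed is forwarded rather than reimplemented, the reader adds no state machinery of its own for close notification; it exists only to hold the lock.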
OK. So, it was just my misunderstanding. Thanks for the confirmation.
We are about to introduce a pool mechanism (or pending read ops, if you prefer async read) into ReadableByteStream. We would have to replicate that mechanism in ExclusiveReader, and it will get more and more complex. When a user defines a custom stream, it should provide its own exclusive reader. Is that possible, given the difficulty we have had doing it ourselves? At the time of #241 (comment), I expected the problem was easier than it really is.
I recommend that we stop and think about what's needed for WritableStream to add an ExclusiveStreamWriter to it. It would give us some insight into what states async reading should have for representing the feeding aspect, in order to support the many requirements we have without being dragged along by the reading (result-emitting) aspect (thinking about both at the same time leads to confusion). For example, "waiting" means that you cannot

Regarding the question I placed in #288 (comment), I now prefer (2) for release behavior. I've been feeling that whoever called getReader() should be aware that the stream is locked and should wait until the reader is released. Allowing something like (3) (not only for this point but for other things, such as using .closed to get notified of the transition to "closed") is a huge source of complexity and doesn't benefit users much, I guess. I'd just add
I don't think it's quite that bad. The design of ExclusiveReader is largely meant to just wrap the logic already present in Readable(Byte)Stream. There shouldn't be any duplication, at least in theory.
I agree this is a concern. Hopefully we can come up with a fairly straightforward template for doing so. Maybe @tyoshino's suggestions can help simplify the model in that direction.
Good idea, I agree.
Hmm. Writable streams already have so many states. And the states there mean fairly different things from readable streams. So I am a bit reluctant. But maybe adding "locked" to both readable and writable streams would be a good simplification.
Yeah, I think I agree now---having to try to implement even the closed getter does just make things annoying and complicated. I am not 100% sure that a "locked" state will be better. But it is worth trying.
Where would you put released---on the reader, or on the stream? I think having it on the reader is pretty OK. For the stream I would like to make ready or wait() (or pull()) notify of all state transitions, including from "locked" to some other unlocked state. But that might also be a source of complexity.
Yeah. My mental model is generally that you should not pass readers around. Once you take a reader, you control the stream until you release it. And if you call a function (like pipeTo) that you expect to exclusively lock the stream, you should probably stop touching the stream until that function signals that it's done---e.g. by returning a promise, like pipeTo does. What I am mainly getting from this is a few things:
This was a comment about the functionality we have whereby we can wait for the release of the reader by then()-ing

Sorry for not elaborating much on the motivation/context for this proposal. But this was from an analysis I did yesterday of what states we should have for a WritableStream with a Writer. Here is a list of states in the form of
For convenience,
I thought it might lead to simplification in general if we have enough promises for capturing just these transitions. Sorry, I cannot give a concrete rationale for now. But this is the background...
It's helpful background!
It depends on the design: if we choose (2) in #288 (comment), you are right. Otherwise, we would need two separate pools, one for the stream and the other for the active reader, and they would sometimes interact with each other.
Sorry, I overlooked that you agreed with (2). Then you're right: the reader can use the byte stream's pool when active.
Hmm, I thought there would be a way to do (3) without such work, but maybe I am wrong. It was in the async-read universe anyway, which I think we are probably not going to pursue.
source.closed.catch(abortDest);
dest.closed.then(
  () => {
    if (!closedPurposefully) {
This part is a little unfortunate.
Hmm. If we reach here, that means the user called pipe after closing dest, or dest became closed without being asked to while locked, which is just a bug in dest.
So... we need this handling in the pipeTo algorithm, which must detect any bad behavior; but under the assumption that dest works correctly and the application code is bug-free, this block is unnecessary for well-behaved streams?
I think you are right. I didn't consider locks on the dest when writing this so if we have those it might be taken care of.
We might even be able to omit this block if the consequences of doing so are not so bad. That is, if the developer has set up a weird situation (like using a duck-typed dest with a getWriter()
method that returns a writer that conforms to the superficial API contract but doesn't actually prevent closing from the outside), they get weird results.
We'll just want to see which tests fail when we remove this block to make sure.
(Force-pushed from 0673906 to bfcc05c.)
In bfcc05c I have uploaded an initial set of examples for the templated tests. (None of them pass or anything, as they are currently testing against a dummy implementation.) Happy to get feedback. You can already see the templating in action: I am reusing the same tests for an errored stream three times, for three different ways of erroring the stream.
(Force-pushed from 38260ad to 1a5c51b.)
Based on discussions in #253. The key differences here from the previous async read() commits are:

- ReadableStreams no longer have read() methods directly; those only exist on readers. This drastically simplifies the stream/reader interaction, and also allows the possibility of different types of readers with different reading behavior.
- read() promises fulfill with { value, done } instead of using an EOS sentinel value. This avoids a number of problems, and also provides a mechanism by which readable byte streams can smuggle out "unused" buffers given to them (using { value: zeroLengthViewOntoBuffer, done: true }).
- The state property is removed (from readable stream).

Another new semantic worth mentioning is that you cannot release a reader while it has read()s pending; doing so will throw. This slightly complicates the pipe algorithm in the { preventCancel: true } case.

This commit also adds some new infrastructure for _templated tests_, and ports some portion of the existing tests there. This is our solution for #217 and #264.

Finally, we re-merge all related code into a single readable-stream.js file, as the setup with three separate files (readable-stream.js, exclusive-stream-reader.js, and readable-stream-abstract-ops.js) was problematic in causing circular dependencies.
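The consumer-side contract this commit describes can be sketched as follows. The helper name is hypothetical, and `stream` is anything exposing the getReader() shape from this commit:

```javascript
// Drain a stream via its reader: read() lives on the reader, fulfills
// with { value, done }, and releaseLock() may only be called once no
// read() is pending (releasing with pending reads throws).
async function readAll(stream) {
  const reader = stream.getReader();
  const chunks = [];
  while (true) {
    // No EOS sentinel: done === true signals the end; for byte streams,
    // value may then be a zero-length view smuggling out an unused buffer.
    const { value, done } = await reader.read();
    if (done) break;
    chunks.push(value);
  }
  reader.releaseLock();  // safe here: the last read() has already settled
  return chunks;
}
```

Note that releaseLock() sits after the loop precisely because of the new semantic: every read() has settled by that point, so the release cannot throw.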
(Force-pushed from 1a5c51b to 6c5edb9.)
Updated and squashed the newest design into a couple of commits, but layered them on top of the previous ones so we have those around for reference. All tests passing, including the templated ones. Now to update examples/spec.
Closing in favor of a new clean pull request for the new model :)
Getting this up there largely so that people can get a sense of how much of a change this is. The changes worth looking at are in lib/readable-stream.js, lib/readable-stream-abstract-ops.js, Examples.md, and index.bs. The rest are tests.
I thought we could get rid of ExclusiveStreamReader, since people would be able to get exclusive access to the chunks just by reading them as soon as they are available. However, sometimes you want to prevent interference with a stream while also letting chunks accumulate in the internal buffer instead of reading them immediately, for backpressure purposes. So we still need a lock to prevent other consumers from snatching those chunks from you.
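The lock's job can be sketched with a minimal stand-in stream (the factory and its shape are illustrative, implementing only the locking behavior, not the full stream): while one consumer holds the reader, chunks may pile up in the internal buffer for backpressure, and a second consumer cannot snatch them.

```javascript
// Minimal stand-in demonstrating exclusive-lock semantics.
function makeLockableStream() {
  let locked = false;
  return {
    getReader() {
      if (locked) {
        throw new TypeError("stream is locked by another reader");
      }
      locked = true;
      return {
        releaseLock() { locked = false; }
      };
    },
    get locked() { return locked; }
  };
}
```

A second getReader() call throws until the first reader releases its lock, which is exactly the exclusivity that pipeTo (or any manual consumer) relies on while deliberately leaving chunks buffered.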
I think adding back ExclusiveStreamReader will actually reduce the diff pretty significantly, so this is not that big of a change to the non-test files. That is heartening to me. So feel free to put off reviewing for another day. I wanted to get this up here, though, to show that I am working on the problem; that's why I haven't had much time to review e.g. OperationStream or respond to comments in #253. I will do my best to catch up tomorrow and maybe a bit over the weekend.