Prototyping several changes for better support for reading bytes #287

Closed
wants to merge 93 commits

Conversation

tyoshino (Member)

No description provided.

@tyoshino (Member Author)

This is an extended version of WritableStream with the following functionality:

  • a single API that returns a connected pair of a writable side and a readable side
  • drained semantics on the readable side
  • the readable side can acknowledge read chunks
  • the writable side can receive acknowledgement info

See the example in the test file to see how this works for a ReadFile/fread-style source with the buffer allocator residing on the source side.

write() returns a structure representing the status of the operation, and read() returns an object used to acknowledge the operation. These are for synchronous pumping, but can be removed if they're considered too complex.

An async readInto() can be realized by using the writable side of the operation stream; you receive EOF via write()'s return value. This serves the need to let users bring their own buffer rather than allocating buffers inside the source. If using write() for reading is confusing, we could rename the operation stream's write() to e.g. issue() and read() to accept(), take(), etc.
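For concreteness, here is a rough, hypothetical sketch of what such a bring-your-own-buffer read loop over the writable side might look like. The names (`status.finished`, `eof`, `bytesWritten`) are assumptions for illustration, not this branch's actual API:

```js
// Hypothetical sketch only: reading into caller-supplied buffers via the
// writable side of an operation stream. The shape of the status object
// (finished / eof / bytesWritten) is an assumption, not the prototype's API.
// Buffer transfer/detaching is ignored in this sketch.
async function readAllInto(writableSide) {
  const buffer = new ArrayBuffer(1024);
  let total = 0;
  for (;;) {
    // write() here means "please fill this buffer with bytes" and returns a
    // structure representing the status of the operation.
    const status = writableSide.write(new Uint8Array(buffer));
    const result = await status.finished;
    if (result.eof) return total;     // EOF is delivered via write()'s return value
    total += result.bytesWritten;     // the source wrote into our own buffer
  }
}
```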

@tyoshino (Member Author)

I meant that, I guess, this single API can serve various use cases.

@tyoshino (Member Author)

The following items proposed in recent discussion are also incorporated:

  • readableAmount
  • window

@tyoshino (Member Author) commented Mar 1, 2015

Now, queue and promises are separated.

See this file for the ReadableStream and WritableStream interface definitions:
https://github.com/whatwg/streams/blob/operationStream/ThinStream.md

  • Regarding locking, I aggressively disallowed all methods on an inactive stream/reader/writer. This relieves us of the task of giving them some possibly convenient (but probably rarely used) semantics while inactive.
  • Separated the "operation" concept from ReadableStream and WritableStream again. It now lives in the OperationQueue library.
  • Methods are grouped based on their purpose.
  • Employed the revealing constructor pattern (see the sketch after this list).
  • Added an unstoppable push source example for demonstration.
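
As a quick, hypothetical illustration of the revealing constructor pattern (only the class name comes from this branch; the callback shape here is an assumption):

```js
// Illustrative only: the revealing constructor pattern. Privileged
// capabilities are "revealed" solely to the code constructing the stream
// (the underlying source), never to ordinary consumers of `rs`.
// The exact callback signature is an assumption for this sketch.
const rs = new ThinReadableStream({
  start(enqueue, close) {
    enqueue(new Uint8Array([1, 2, 3]));  // only the source can enqueue
    close();                             // or close the stream
  }
});
// Consumers of rs never see enqueue/close; they only get the reading API.
```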

I haven't demonstrated it yet, but:

@tyoshino (Member Author) commented Mar 1, 2015

I have also been trying to keep the queue-handling class useful for source/sink implementors, but it hasn't taken good shape yet.

@tyoshino (Member Author) commented Mar 1, 2015

Sorry for not yet having responded to your questions, Domenic. I'll respond to them from the office tomorrow...

@tyoshino (Member Author) commented Mar 2, 2015

Domenic: Response to #287 (comment) cont'd

> These are really interesting observations. Can you explain more why a readable stream would need acknowledgement/error for each chunk read?

When we were discussing the idea of having a buffer pool inside a readable stream and handing buffers to the user on reading, we found that we need to explicitly tell the source when a buffer becomes OK to reuse. As you said, ReadableStream's read() implicitly acknowledges the data, but we don't always finish consuming the contents of the read buffer synchronously. So, I attempted to add an explicit, delayable acknowledgement feature: read() returns an object on which the acknowledgement methods are placed.

const dataAndAckMethod = s.read();
// Use the data part.
...
// When the data is consumed (possibly in another task), call the ack method.

In addition to that, I thought that, similarly to the bring-your-own-buffer-model readable stream, we may want a please-write-to-this-buffer-model writable stream. That is, suppose we have a sink that prepares and exposes some memory region as an ArrayBuffer and asks the producer code to write bytes into it. To wrap this sink with some streams-variant API, we need a method which returns a region for writing data and also returns some methods for telling the sink that the writing is complete.

const regionAndCompleteMethod = s.write();
// Write data to the region
...
// When done (possibly in another task), call the complete method.

This reduces the number of ArrayBuffer allocations compared to a please-give-me-a-filled-ArrayBuffer-model writable stream. This s.write() is logically equivalent to the current WHATWG ReadableStream if the return value type is changed to the Operation object.
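
Fleshing that out a little, a hypothetical sketch (the `region` / `complete()` names are assumptions used only for illustration, not this branch's actual API):

```js
// Hypothetical sketch of the "please write into this buffer" writable model.
// write() returning { region, complete } is an assumption for illustration.
const { region, complete } = s.write();   // region: a Uint8Array over sink-owned memory
const bytesProduced = fillWithData(region);
complete(bytesProduced);                  // tell the sink how much was written

// Placeholder producer used above: writes a few bytes and reports the count.
function fillWithData(view) {
  const data = [0x68, 0x69];              // "hi"
  view.set(data);
  return data.length;
}
```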


Anyway, these are just attempts to address each of the semantics separately. Now, I'd like to go back to investigating which (set of) semantics we should support; one combination is sketched in code after the lists below.

Source of bytes

  • Choice: who allocates buffers
    • Code prepares and passes an ArrayBuffer to source. Source writes bytes to it and notifies code of completion.
      • writing bytes completes synchronously (when read() returns, writing is finished).
      • Source notifies code of completion asynchronously.
    • Source prepares and passes a filled ArrayBuffer to code.
      • Code returns the ArrayBuffer to source when done consuming.
      • GC collects consumed ArrayBuffers.

Sink of bytes

  • Choice: when write is allowed
    • Sink automatically starts accepting writes (with a backpressure hint)
    • Some preceding method call makes the sink accept writes
  • Choice: who allocates buffers
    • Sink prepares and passes an ArrayBuffer to code. Code writes bytes to it and notifies sink of completion.
      • Choice: allocation is async? sync?
        • write method returns an ArrayBuffer synchronously
        • write method returns an ArrayBuffer asynchronously
    • Code prepares and passes a filled ArrayBuffer to sink.
      • Choice: recycle the buffer?
        • Sink consumes the contents synchronously, so code can start reusing the buffer immediately.
        • Sink notifies code once it finishes consuming bytes from the buffer, allowing code to recycle it.
        • Sink takes ownership of the ArrayBuffer and lets the GC collect it once consumed.
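
To make one of the combinations above concrete, here is a hedged sketch of the "source allocates buffers; code acknowledges when done consuming" case from the Source of bytes list. All method names here (readable, read(), ack()) are illustrative assumptions, not this branch's actual API:

```js
// Hypothetical "source allocates, code acknowledges" combination.
// read() returning { value, ack, done } and the readable promise are
// assumptions for illustration.
async function consumeWithRecycling(readableSide) {
  for (;;) {
    await readableSide.readable;           // assumed: fulfilled when a chunk is available
    const { value, ack, done } = readableSide.read();
    if (done) return;
    await processChunk(value);             // may finish in a later task
    ack();                                 // the source may now recycle this buffer
  }
}

// Placeholder consumer used above.
function processChunk(view) {
  return Promise.resolve(view.byteLength);
}
```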

@tyoshino (Member Author) commented Mar 2, 2015

Again, reply to Domenic's comment #287 (comment)

> space/waitForSpaceChange are on writable stream. I guess they are more sophisticated versions of the current backpressure signal, of "writable" or "waiting", that can be used to communicate how much can be written. This makes sense.

Right.

> I think it mostly makes sense for WritableByteStream, because then when piping you could ask for as many bytes from the readable byte stream as space indicates on the target writable stream. Maybe it could make sense for general readable streams using their size() strategy method, but that is kind of complicated. E.g. it assumes compatible notions of size between the source stream and the target. Hmm.

Yes. There could be some translation between each pair of source type and target type, but it's not always so simple, nor always worth abstracting in the streams library, I guess. For byte streams, yes, I think it would be useful.

> I don't quite understand window though. I imagine it maps to TCP window stuff but I don't understand that perfectly either :(. Could you help me understand it? And, how general is such an idea---just byte streams, or anything with a size, or...? Does it depend on the capabilities of the underlying source, so e.g. it works for sockets but not for files?

Right. It's inspired by TCP. For byte streams, using window we can specify how much data we want to receive from the remote host, read from the kernel, read from another process, etc. In an older thread, I said that I guessed it's not rare to want to control the strategy's high water mark from user code based on the current situation, e.g. backpressure from the sink that the data read from the readable stream finally reaches. Basically, it's fine to understand window as a knob for adjusting the high water mark.
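
As a minimal, hypothetical sketch of that knob (setWindow(), space, and waitSpaceChange() as used here are assumed names, loosely based on the ones discussed above, not this branch's actual API):

```js
// Hypothetical sketch: using window as a high-water-mark knob while piping,
// sized by how much space the destination currently reports.
// setWindow()/space/waitSpaceChange()/readable/read()/write()/close() are
// assumed method shapes for illustration.
async function pipeWithWindow(readableSide, writableSide) {
  for (;;) {
    await writableSide.waitSpaceChange();         // wait until the sink can take more
    readableSide.setWindow(writableSide.space);   // ask the source for only that much
    await readableSide.readable;                  // assumed: data has arrived
    const { value, done } = readableSide.read();
    if (done) return writableSide.close();
    writableSide.write(value);
  }
}
```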

@tyoshino (Member Author) commented Mar 2, 2015

The addition of window and space is just a bonus. I felt that they might be connected with the issue we're trying to solve, but it has almost turned out to be unrelated. So, I think we can postpone it.

@tyoshino changed the title from "Operation stream" to "Prototyping several changes for better support for reading bytes" on Mar 3, 2015
@domenic (Member) commented Mar 3, 2015

Continued from #253 (comment)

> From operationStream, I'd like to incorporate:
>
>   • queue / stream separation

Does this affect public API (either for stream creators or users)?

I notice the new comments at the top of the file (which are extremely helpful) have slightly different underlying sink/source designs. Instead of being based on strategy they give more direct control---which is probably a good change. Is that part of queue / stream separation?

>   • revised set of promises

These are interesting. To summarize:

  • Readable stream:
    • readable which is fulfilled when the stream enters the readable state
    • errored which is fulfilled when the stream enters the errored state
    • No more closed (seems bad); no more ready (possibly can be replaced by above two)
    • How does this fit with the async pull() + sync read() idea?
  • Writable stream:
    • Similar to above: writable and errored; no more closed
    • Also has a waitSpaceChange() promise-returning method.

I am not sure these are an improvement, personally. What was the motivation?

> revised preconditions for getReader() / getWriter() and release()

The addition of the "locked" state is hopefully not too annoying. Prototyping will help confirm. But in general (I know I'm repeating myself, sorry) I am in favor of changing the preconditions in ways that make things more straightforward.

@tyoshino (Member Author) commented Mar 4, 2015

> Does this affect public API (either for stream creators or users)?

Creators are affected. Users are not affected.

> I notice the new comments at the top of the file (which are extremely helpful) have slightly different underlying sink/source designs. Instead of being based on strategy they give more direct control---which is probably a good change. Is that part of queue / stream separation?

Yes, that's exactly what I meant by queue / stream separation. Maybe there are still sources/sinks that really want to use a queue. We can provide some additional helper classes for them, i.e.:

  • ThinWritableStream + QueueSink === Current WritableStream
    • QueueSink would have API similar to current WritableStream's underlying sink
  • ThinReadableStream + QueueSource === Current ReadableStream
    • QueueSource would have API similar to current ReadableStream's underlying source

The Thin* classes are used by sources/sinks that need freedom.
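
As a rough, hypothetical illustration of that layering (only the class names come from the list above; the constructor shapes are assumptions for illustration):

```js
// Hypothetical composition: a queue-backed sink layered under a thin stream,
// approximating the current WritableStream. Constructor signatures are
// assumptions, not this branch's actual API.
const sink = new QueueSink({
  write(chunk) { /* flush the chunk to the real destination */ },
  close() { /* tear down the destination */ }
});
const ws = new ThinWritableStream(sink);  // behaves roughly like today's WritableStream
ws.write(new Uint8Array([1, 2, 3]));
```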

> revised set of promises
>
> I am not sure these are an improvement, personally. What was the motivation?

After a bunch of changes, I'm not so inclined to push this. They were for some simplification; I don't remember much... I've almost reverted them for now.

> The addition of the "locked" state is hopefully not too annoying. Prototyping will help confirm. But in general (I know I'm repeating myself, sorry) I am in favor of changing the preconditions in ways that make things more straightforward.

We've almost reached agreement that it's fine as long as we hide internal state and forbid operations on the stream while it's locked. Then, prohibiting everything should be fine. Yes, we need to be careful not to decrease usability, but it's probably OK...

On the latest commit, I've prototyped byte source / stream (reader) separation as discussed around #253 (comment).

@tyoshino (Member Author) commented Mar 4, 2015

Please ignore files with top comment "WIP WIP WIP". They're incomplete snippets.

@tyoshino (Member Author) commented Mar 4, 2015

Interfaces

  • thin-readable-byte-stream.js: manual pull readable stream
  • thin-readable-stream.js: auto pull readable stream
  • thin-writable-byte-stream.js: manual disposing writable stream
  • thin-writable-stream.js: auto disposing (no garbage port) writable stream

Utility

  • thin-stream-base.js: pipeStreams is defined here, and some helpers are exposed for building transform streams

Examples

  • fake-file-backed-byte-source.js: Sample implementation of a file-backed byte source from which we can get either a manual pull readable stream or an auto pull readable stream.
  • mock-byte-sink.js: Sample implementation of a byte sink from which we can get a manual disposing writable stream or an auto disposing writable stream.
  • stream-queue.js: A queue with readable stream side and writable stream side.
  • thin-stream-utils.js: Helpers for examples

Test

  • test/experimental/thin-stream.js

@tyoshino (Member Author)

Almost all of the ideas prototyped in this branch have been implemented, abandoned, or sublimated into something else and landed up through e601d69.

The idea of a ReadableStream with acknowledgement is well described in some issues, e.g. #324. The reference implementation has been drastically modified. For these reasons, we don't need to keep this PR open any more.

Closing.

@tyoshino closed this on Mar 28, 2016
@tyoshino deleted the operationStream branch on March 28, 2016 at 13:24