Define when we consider a data is consumed by the reader #76

tyoshino · 2014-02-10T05:43:16Z

The consumer code touching the ReadableStream API cannot obtain fine-grained control over how much data to pull; that is hidden. So, if the data source uses big chunks, the consumer code may get an unexpectedly big chunk from a read() call. It's OK to get a big chunk, but I don't want the ReadableStream to start pulling new data on that read() call: the consumer code is being overwhelmed by that big chunk, so the chunk shouldn't immediately be considered consumed.

In the W3C spec, I tried to address this with amountBeingReturned. Could you please investigate this issue?

domenic · 2014-02-10T06:45:27Z

Hmm. At first I was really concerned after thinking about this. But after thinking longer, I am not so sure it is a problem. Let me know what you think of this reasoning.

A readable stream is not concerned with the consumer. It maintains an internal buffer, with a high-water mark, by itself. It uses this internal buffer and high-water mark to determine when to apply backpressure to the underlying source.

So, it doesn't matter what the consumer code thinks of how big the chunks are. The only thing that matters is whether the readable stream's buffer is too full or not. If the consumer code gets a big data chunk, it can simply choose not to call read() again for a while. The readable stream then decides what implications this has for applying backpressure.

More concretely. Consider a fast readable stream with large chunks being consumed slowly. Let's say that the readable stream's buffer is already full, so backpressure is being applied to the underlying source. The consumer calls read(). Then:

1. If the consumption of the chunk drained the entire buffer, then it is the readable stream's job to anticipate future requests, and turn off backpressure in order to gather data up to the high-water mark in preparation for future reads, whenever the consumer is ready.
2. If the buffer is still draining, it will continue to apply backpressure, letting the buffer drain before turning the backpressure off.
3. If the buffer is still at or above the high-water mark, it will of course continue to apply backpressure.
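
A minimal sketch of the three cases above, assuming a toy buffer model rather than the real stream implementation. The class name `ToyReadableBuffer`, its fields, and the use of plain byte counts as chunks are all hypothetical; the sketch only illustrates when backpressure would flip after a read().

```javascript
// Toy model only: this is NOT the Streams API. All names are hypothetical.
class ToyReadableBuffer {
  constructor(highWaterMark) {
    this.highWaterMark = highWaterMark; // bytes the stream is willing to buffer
    this.chunks = [];                   // queued chunks, stored as byte counts
    this.backpressure = false;          // true => underlying source is paused
  }

  get bufferedBytes() {
    return this.chunks.reduce((sum, size) => sum + size, 0);
  }

  // The underlying source hands over a chunk of `size` bytes.
  push(size) {
    this.chunks.push(size);
    if (this.bufferedBytes >= this.highWaterMark) {
      this.backpressure = true;
    }
  }

  // The consumer takes one chunk; the stream then re-evaluates backpressure.
  read() {
    const size = this.chunks.shift();
    const remaining = this.bufferedBytes;
    let state;
    if (remaining === 0) {
      state = "drained";          // case 1: refill in anticipation of reads
      this.backpressure = false;
    } else if (remaining < this.highWaterMark) {
      state = "draining";         // case 2: backpressure is left as it was
    } else {
      state = "full";             // case 3: still at or above the mark
      this.backpressure = true;
    }
    return { size, state, backpressure: this.backpressure };
  }
}

// One oversized chunk against a small high-water mark (the case 1 scenario).
const stream = new ToyReadableBuffer(16);
stream.push(64);            // 64 >= 16, so backpressure turns on
console.log(stream.read()); // buffer is empty afterwards, backpressure turns off
```

In this model the consumer never tells the stream anything about its own pace; the only input is how much is left in the buffer after each read().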

You seem to be worried about case 1. But I argue that it is not a problem; it is still good to anticipate future data requests. If anything, it is simply a misconfigured readable stream: it has a high-water mark that is too low for the possible data chunk sizes coming through. You should never really be able to drain the entire buffer, from above the high-water mark to empty, with one read().

What do you think?

tyoshino · 2014-02-10T08:31:52Z

> A readable stream is not concerned with the consumer.

Yeah. The question is which roles should be handled by the stream and which need involvement from the consumer (and information only the consumer has).

> You seem to be worried about case 1

Yes, I'm concerned with cases like 1. But there was a misunderstanding on my side (#75). This won't be an issue if there's no low-water mark.

So, the strategy is intended to be configured by the consumer? I'd like to start thinking about a concrete usage. Suppose that we're redesigning XHR to return a ReadableStream. How do we pass the initial value of the high-water mark, and what do we do when the consumption speed changes? Maybe we wrap the strategy behind some subclass and expose that to the consumer code.

domenic · 2014-03-10T04:57:22Z

Sorry for the delay.

> So, strategy is intended to be configured by the consumer?

Not quite; the strategy is intended to be configured by the stream creator. In the case of XHR, that would be the browser, which would probably set either an appropriate HWM for network operations generally, or perhaps one that depends on current network conditions. (But, see #13 (comment) for how nobody ever changes Node's defaults.)

tyoshino · 2014-03-28T09:15:36Z

To get the best TCP performance over a high-latency network, we need to allocate a buffer large enough for the receive window to equal the bandwidth-delay product (BDP). A similar situation may arise for other sorts of producers.
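
As a rough worked example of the bandwidth-delay product mentioned above: it is the link bandwidth multiplied by the round-trip time. The function name and the link numbers below are illustrative assumptions, not measurements from any real deployment.

```javascript
// BDP in bytes = bandwidth (bits/s) * round-trip time (s) / 8 bits per byte.
// The RTT is taken in whole milliseconds to keep the arithmetic exact.
function bandwidthDelayProductBytes(bandwidthBitsPerSecond, rttMilliseconds) {
  return (bandwidthBitsPerSecond * rttMilliseconds) / 1000 / 8;
}

// Hypothetical link: 100 Mbit/s with an 80 ms round-trip time.
const bdpBytes = bandwidthDelayProductBytes(100e6, 80);
console.log(bdpBytes); // 1000000, i.e. about 1 MB in flight to keep the link full
```

A receive window (and therefore a buffer) smaller than this value would leave such a link idle for part of every round trip.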

Suppose we wrap a TCP socket with a ReadableStream. Since pull doesn't pass along how much data the RS can buffer (the HWM), the socket cannot adjust the amount to pull from the server (the receive window). (A BSD socket can't do this itself, but a better API in the future might allow us to adjust the receive window dynamically.) needsMore only signals backpressure via the return value of push. So, all we can do is set the HWM to the BDP. But for a slow consumer, such a big window is undesirable: it just leads to unexpected inflation of the buffer. For a fast consumer, on the other hand, a small HWM is frustrating, since the BDP won't be filled.

So, I think:

1. We should pass along some kind of information on pull, say the space available in the RS (HWM minus the filled amount) or its speed. It doesn't need to limit push; working as a hint is OK.
2. There should also be some interface for a consumer to tell the RS (or the strategy object?) its consumption speed, in addition to the signal given by how often read() is called.
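
To make the first suggestion concrete, here is a hypothetical sketch of a buffer that passes the space available (HWM minus filled amount) to its source on every pull. Nothing here is part of any spec: `HintingBuffer`, `spaceAvailable`, and `windowedSource` are names invented for illustration, and the source simply pretends that the peer fills the advertised window exactly.

```javascript
// Hypothetical sketch: neither `spaceAvailable` nor this class exists in any spec.
class HintingBuffer {
  constructor(highWaterMark, source) {
    this.highWaterMark = highWaterMark;
    this.filled = 0;       // bytes currently buffered
    this.source = source;  // object with pull(hint) that returns a byte count
  }

  // Ask the source for more data, telling it how much room is left.
  fill() {
    const spaceAvailable = Math.max(0, this.highWaterMark - this.filled);
    if (spaceAvailable === 0) return 0;
    // The hint is advisory: the source may deliver more or less than this.
    const delivered = this.source.pull({ spaceAvailable });
    this.filled += delivered;
    return delivered;
  }

  // The consumer removes up to `bytes` from the buffer.
  consume(bytes) {
    const taken = Math.min(bytes, this.filled);
    this.filled -= taken;
    return taken;
  }
}

// Stand-in for a socket that sizes its receive window from the hint.
const windowedSource = {
  lastWindow: 0,
  pull({ spaceAvailable }) {
    this.lastWindow = spaceAvailable; // e.g. advertise this as the receive window
    return spaceAvailable;            // pretend the peer filled the window exactly
  },
};

const hinted = new HintingBuffer(1024, windowedSource);
hinted.fill();        // hint is 1024, buffer becomes full
hinted.consume(256);  // a slow consumer frees only 256 bytes
hinted.fill();        // hint shrinks to 256, so the window shrinks with it
console.log(windowedSource.lastWindow); // 256
```

With a hint like this, a slow consumer would naturally shrink the window, while a fast consumer that keeps the buffer empty would let the source use the whole HWM.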

Sorry if this has already been well discussed.

domenic · 2014-06-17T16:29:27Z

Consolidated into #119.

domenic added the question label Feb 10, 2014

domenic added the documentation label Mar 4, 2014

domenic added the buffering-strategies label Apr 14, 2014

tyoshino mentioned this issue Apr 24, 2014

Can we add buffering strategies via a transform stream? #24

Closed

domenic mentioned this issue Jun 17, 2014

Validate queueing strategies are sufficiently powerful #119

Open

domenic closed this as completed Jun 17, 2014
