Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for bulk read operation #1003

Open
vasilvv opened this issue Jun 28, 2019 · 1 comment
Open

Add support for bulk read operation #1003

vasilvv opened this issue Jun 28, 2019 · 1 comment

Comments

@vasilvv
Copy link

vasilvv commented Jun 28, 2019

I've received feedback from multiple web developers who work with streams and stream-like API that, when they are transmitting huge chunks of data (usually video), the per-promise overhead can seriously harm the performance. This can be solved by allowing the application to specify the desired amount of data in the stream queue and resolving the promise with that desired amount in one go.

The specific proposal in mind I have is a bulkRead method, which would have a signature of

Promise<ReadResult> bulkRead(unsigned long sizeRequested, unsigned long timeoutMs, bool coalesce);

The resulting promise would be resolved if one of those things happen:

  1. sizeRequested amount is available in the queue, where the size is in chunk count or in bytes, depending on how the controller defines it.
  2. timeoutMs milliseconds pass since bulkRead() was called.
  3. End of stream is reached.

The value returned in the read result would normally be an array of chunks currently in the queue, up to sizeRequested (but less if timeout/EOS happened); if coalesce is specified, a value equivalent to calling concat method on all chunks is returned, instead of array.

This could potentially synergize well with other Streams performance proposals, like BYOB (saving on buffer allocation) or #757 (zero-copy).

Some alternatives considered:

  1. Fixing promises to be always performant. This is already the usual case. The long tail of cases when the current API is slow (I/O patterns that result in 1000s of chunks/second, JS engines on constrained platforms, etc) is fairly diverse and occurs on many different layers. Even if we could solve them, the proposed API would be a defense-in-depth against this family of issues.
  2. Setting chunk size in originating API (e.g. fetch) would probably work around one of the current pain points. One of the main reasons I'm not particularly enthusiastic about this is that in some of the use cases, a stream can be a mix of large and small messages, in which case increasing chunk size would result in small messages having higher latency.

Some similar APIs in other systems, for reference and design inspiration:

  • bulkRead for bytestreams is equivalent to what one would normally get with Linux epoll/read.
  • bulkRead for message-based streams (coalesce set to false case) is similar to Linux recvmmsg call.
@vasilvv vasilvv changed the title Add support for bulk write operation Add support for bulk read operation Jun 28, 2019
@ricea
Copy link
Collaborator

ricea commented Jul 11, 2019

Sorry for the slow response.

I think as a performance-oriented API, this has a high bar to meet. We need to show that it unlocks performance gains that would be otherwise unachievable.

In some respects it is similar to BYOB readers, which by avoiding repeated buffer allocations supposedly unlock higher performance. Unfortunately, real-world performance benefit has never been shown, because no browsers have implemented BYOB readers.

I think I want to see a demonstration of real performance improvement in a real-world situation before adding this to the standard. That means we need an implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants