Add a flag to determine wether a ReadableStream is disturbed or cancelled #1025

rumkin · 2019-12-15T07:30:51Z

Proposal

Implement flags to determine in a synchronous way wether the stream is disturbed and cancelled as boolean properties ReadableStream#disturbed and ReadableStream#closed.

Rationale

ReadableStream should have a flag which describes if the stream is disturbed or not. Without this flag it's just impossible to understand if the first chunk of the stream would be the first chunk of the data. That's why behaviors to check is stream has been read already is implemented in all major browsers (Chromium, Safari, Firefox) and is using by FetchAPI (Response and Request constructors).

I know there is an opinion against it. But, as I've already showed, without such flags it's impossible to implement reliable solution. Current design creates asynchronous undefined state and doesn't allow to understand wether the stream is correct constructor argument or not and lead to unpredictable delayed race conditions. As I understand such a solution made due to assumption that stream can be non-exclusively owned and as a result cancelled in any time, what seems incorrect.

Example

I'm implementing a server which works with Web StreamAPI. It requires an ability to check, if the stream contains acceptable body or not, to prevent throwing parsing errors on valid request and to detect race condition occurred in the process of request handling. Currently I solve this in a hacky way using lowlevel API from the Response object implementation, without this I will emit error each time someone else read the stream without being sure where in the code it has happened.

Solution with Response constructor:

function isStreamDisturbed(stream: ReadableStream): Boolean {
  try {   
      new Response(stream)
    
      return false
    }
    catch {
      return true
    }
}

const stream = new ReadableStream()
stream.cancel()

isStreamDisturbed(stream) // -> true

Of cause this solution is a hack and couldn't be decided as reliable.

The text was updated successfully, but these errors were encountered:

ricea · 2019-12-16T11:31:14Z

I am leaning towards adding isDisturbed (maybe just as disturbed?) but not isCanceled. We don't even track internally whether the stream was canceled. From the point of view of the consumer of the stream, it doesn't make a difference whether the stream was canceled or simply read until close.

rumkin · 2019-12-16T20:48:23Z

@ricea Replaced with disturbed and closed.

yutakahirano · 2019-12-18T04:20:51Z

disturbed is designed for Response.bodyUsed and I'm not sure if we should advertise its use widely. Personally I'm not a fan of the concept...

rumkin · 2019-12-20T14:02:46Z

@yutakahirano It's not only used for Response.bodyUsed it's mainly used to understand if response body is valid at all. This is why Response's constructor fails and not just set bodyUsed in true when the stream has been already disturbed. Without this restriction such things like ServiceWorker's cache wouldn't work correctly.

yutakahirano · 2020-01-07T05:03:47Z

disturbed represents whether the stream has ever been read. Validity is (or, at least, can be) a different concept I think.

rumkin · 2020-01-07T23:38:41Z

But in the concept of Request/Response, if the stream is disturbed, then it's not a valid body. So here "disturbed" and "valid" are synonyms.

jasnell · 2021-06-29T18:04:04Z

After having just gone through the implementation for Node.js core, I'd definitely be +1 on a disturbed property on ReadableStream.

benjamingr · 2021-08-02T11:13:46Z

@ricea @annevk would a spec PR adding a .disturbed property on ReadableStream be welcome :)?

annevk · 2021-08-02T11:20:14Z

What would the use case be? As discussed in #1105 it's rather specific to the Fetch APIs and unclear if we would do that again if starting fresh.

benjamingr · 2021-08-02T11:24:05Z

@annevk honestly Node only needs it for the fetch API as well so far nodejs/node#39627

@ronag is there any other place we need it?

ronag · 2021-08-02T11:31:37Z

@annevk honestly Node only needs it for the fetch API as well so far nodejs/node#39627

@ronag is there any other place we need it?

So far just for fetch.

domenic · 2021-08-02T12:56:49Z

Still -1 on this; probably it's best if we should close. Fetch is meant to be able to muck with the internals of ReadableStream anyway; keeping them independent is not a design goal.

benjamingr · 2021-08-03T06:18:41Z

Yeah if this is just fetch and nothing else node can just “cheat”

rumkin · 2021-10-18T14:06:55Z

It is cursed practice to produce cheating solutions. And TBH it's just strange to see proposals like this in a standard development body. WhatWG shouldn't lower the bar for itself. This behavior is decreasing reliability of Web Platform by several points and WhatWG IMO should earn points not to spend them. Transparent and sustainable standards are WhatWG's KPI and quality metrics.

Fetch and Stream APIs have been developed by WhatWG, but somehow become inconsistent and now require to be fixed. They throw errors in browsers and these errors aren't fully catchable due to web platform error messages fragmentation. Also there is some kind of undefined behavior in the current Stream design. Now it about to bring fragmentation into Node.js and embeddable solutions. And it will be hard to get rid of it.

There are several solutions which are properly developed, have better maintenance and could fix the issue:

Add a flag to Stream instance.
Add fetch.canConsumeStream() method.
Create DisturbedStream error.
~~Change Fetch API~~.

1. Add a flag to Stream instance

It's one of core feature for consequential data streams, to provide information about own state. And it should work like this:

Developer checks stream's state with a flag !stream.disturbed && !stream.locked.
Claim ownership with stream.getReader() or pass a stream into synchronous function which claims ownership.
Start consuming data from a stream.

Such solution is universal and solve all possible errors and brings freedom to developers to implement any kind of behavior. Without such flag there is undefined state presented.

Pros

Solves the issue with O(1) complexity as: !stream.isDisturbed.
Solve undefined state issues.

Cons

Requires standard modification.

2. Add `fetch.canConsumeStream()` method.

This os ok. In the case if the Fetch API won't change its behavior and will be the only API which requires disturbed flag. But in the case there will be more APIs which will use disturbed flag, there will be more such methods.

It's required for many APIs: web request and response handling, file reading and writing, video and audio processing.
So it's crucial to implement the first option.

Pros

Solves the issue with O(1) complexity as: fetch.canConsume(stream).

Cons

May require duplicate methods in other APIs in the future.

3. Create DisturbedStream error.

It solves potential errors in the future and lets platforms to decide how to implement Fetch API and wether to rely on disturbed flag or not. And let developers to handle the error with simplicity by its name.

Pros

Solves the issue with O(1) complexity with:

    try {
       const req = new Request(url, {body: stream})
       // ...
    } catch (err) {
      if (err.name === 'DisturbedStream') {
        // ...
      }
   }

Cons

Wordy and more resource-consuming.

4. Change Fetch API.

~~It won't be a breaking change~~, if Fetch API would start to accept disturbed streams. So in the case if such behavior is admissible, then this is the better option and the most simple solution for now.

Pros

Easy to implement.
Reduces ambiguity and difference in behavior.

Cons

Doesn't seems to be proper in case of consequential streams.

annevk · 2021-10-18T14:14:12Z

Cheat was probably the wrong word as it's perfectly normal for trusted libraries to hook into internals. Having read your post you haven't really written down any use cases other than asserting there might be scenarios similar to fetch in the future. I think that means we're still stuck on #1025 (comment).

domenic · 2021-10-18T15:08:27Z

Yeah, good reminder to close this issue.

rumkin · 2021-10-18T17:24:03Z

It's normal for trusted libraries to hook into internals, it's not ok to make developer rely on some heuristics to determine what happened with the code.

Consistency and reduced complexity of web platform is itself the best use-case. But moreover current API has inconsistency in design and the first consumer of it – the Fetch API – is an indicator of the problem.

So developers couldn't check validity of streams in constructors, whaat will make API design unnecessary complex: developer should create a class instance, then run some async method to check if everything is fine and constructor arguments are valid.

And the main problem with this solution is redundant delays and memory usage. Here it is the solution by @domenic in node's repo:

async function isReadable(reader) {
  let readable = true;
  reader.closed.then(() => { readable = false; }, () => { readable = false; });
  await Promise.resolve();

  // If the stream was closed or errored then the closed promise would have settled by now and updated readable.
  // If not then it's readable.
  return readable;
}

It creates 3 promises and 3 jumps over event loop stack just to determine stream state. Moreover function scope with its boolean, function and one of this promises will live as long as stream is not closed and thus would hold memory. In the case of high loads, this will be memory pressure out of nowhere. This check should be made in a single synchronous call.

As a conclusion, current API design:
a. overcomplicated,
b. brings redundant load,
c. distribute complexity over multiple environments, which are Webkit, Firefox, Chromium, Node, Deno, embeddable JS and other JS environments.

So the question here is what the use case for making Stream status check asynchronous?

@domenic Please, reopen this issue. Also could you answer on the question above or provide a link where I could find such answer, because it seems like you are the one who's responsible for the API and all the decisions. I don't want to spend anyone time, but the issue isn't solved.

domenic · 2021-10-18T17:28:58Z

Nobody has provided use cases, beyond emulating a fetch API which we regret. We won't be reopening the issue until such use cases are provided.

rumkin mentioned this issue Jan 31, 2020

Prefer web APIs over node APIs ipfs/js-dag-service#13

Open

ricea mentioned this issue Feb 1, 2021

Introspection: disturbed or locked #1105

Open

benjamingr mentioned this issue Aug 2, 2021

ReadableStream internal state nodejs/node#39627

Open

domenic closed this as completed Oct 18, 2021

szmarczak mentioned this issue Dec 9, 2021

fix: errored stream is disturbed nodejs/undici#1134

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a flag to determine wether a ReadableStream is disturbed or cancelled #1025

Add a flag to determine wether a ReadableStream is disturbed or cancelled #1025

rumkin commented Dec 15, 2019 •

edited

Loading

ricea commented Dec 16, 2019

rumkin commented Dec 16, 2019

yutakahirano commented Dec 18, 2019

rumkin commented Dec 20, 2019

yutakahirano commented Jan 7, 2020

rumkin commented Jan 7, 2020

jasnell commented Jun 29, 2021

benjamingr commented Aug 2, 2021

annevk commented Aug 2, 2021

benjamingr commented Aug 2, 2021

ronag commented Aug 2, 2021 •

edited

Loading

domenic commented Aug 2, 2021

benjamingr commented Aug 3, 2021

rumkin commented Oct 18, 2021 •

edited

Loading

annevk commented Oct 18, 2021

domenic commented Oct 18, 2021

rumkin commented Oct 18, 2021

domenic commented Oct 18, 2021

Add a flag to determine wether a ReadableStream is disturbed or cancelled #1025

Add a flag to determine wether a ReadableStream is disturbed or cancelled #1025

Comments

rumkin commented Dec 15, 2019 • edited Loading

Proposal

Rationale

Example

ricea commented Dec 16, 2019

rumkin commented Dec 16, 2019

yutakahirano commented Dec 18, 2019

rumkin commented Dec 20, 2019

yutakahirano commented Jan 7, 2020

rumkin commented Jan 7, 2020

jasnell commented Jun 29, 2021

benjamingr commented Aug 2, 2021

annevk commented Aug 2, 2021

benjamingr commented Aug 2, 2021

ronag commented Aug 2, 2021 • edited Loading

domenic commented Aug 2, 2021

benjamingr commented Aug 3, 2021

rumkin commented Oct 18, 2021 • edited Loading

1. Add a flag to Stream instance

Pros

Cons

2. Add fetch.canConsumeStream() method.

Pros

Cons

3. Create DisturbedStream error.

Pros

Cons

4. Change Fetch API.

Pros

Cons

annevk commented Oct 18, 2021

domenic commented Oct 18, 2021

rumkin commented Oct 18, 2021

domenic commented Oct 18, 2021

rumkin commented Dec 15, 2019 •

edited

Loading

ronag commented Aug 2, 2021 •

edited

Loading

rumkin commented Oct 18, 2021 •

edited

Loading

2. Add `fetch.canConsumeStream()` method.