Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposed backward-incompatible change: requiring cloning all requests/responses #61

Closed
domenic opened this issue Jun 8, 2015 · 74 comments

Comments

@domenic
Copy link
Member

domenic commented Jun 8, 2015

In yutakahirano/fetch-with-streams#37 @yutakahirano, @wanderview, @tyoshino and I have been going in circles with regard to a pretty big mismatch between how streams work and how Request/Response currently work. We have a proposal that would make everything a lot simpler, but it does require a backward-incompatible change. Hoping to get opinions on whether it's a feasible change. /cc @slightlyoff @jakearchibald

Basically, the premise is that it's extremely surprising to developers that

var req = new Request(url, { method: "POST" });
fetch(req);
fetch(req);
cache.add(req);

works, but

var req = new Request(url, { method: "POST", body: "foo" });
fetch(req);
fetch(req);
cache.add(req);

fails on the second fetch, since the body is drained. That is, the latter case requires cloning before use, whereas the former does not.

We propose that:

  • Passing a request to fetch or cache.add, or calling .text() etc., sets req.bodyUsed to true.
  • If any of those methods gets a request with req.bodyUsed as true, they throw.

In my opinion this helps the story a lot. Right now the rules for when you need to clone are pretty darn arbitrary, and in fact can be a refactoring hazard, as shown above. It also solves a lot of headaches around streams + fetch integration---if you want to dive into the linked thread, you'll see that we're having a really hard time rationalizing bodyUsed and related behavior in terms of streams.

However, there might be people out there already using multi-fetch/cache on empty-body requests/responses. So this might be a non-starter. I suggested adding use counters, but that might take a while. What are others thoughts? I am hopeful that this is a rare pattern, but I might be wrong...

@wanderview
Copy link
Member

I just want to note there are alternatives here.

For example, streams could track and expose a .drained state. This is a boolean. Its cheap. As far as I can tell it has no practical downsides. Its just not theoretically pure to the current vision of streams.

Obviously @domenic and I disagree here. I just want to point it out as an option, though, before we decide to break backward compat.

@domenic
Copy link
Member Author

domenic commented Jun 8, 2015

Its just not theoretically pure to the current vision of streams.

That's really overstating it. It mutates streams into a completely new data structure. You wouldn't ask someone to add a .drained to arrays, asking if they had started out nonempty but then someone .pop()ed them into emptiness.

@wanderview
Copy link
Member

You wouldn't ask someone to add a .drained to arrays, asking if they had started out nonempty but then someone .pop()ed them into emptiness.

I guess I don't think that's a fair comparison. Streams and arrays are different.

For example, can you close an array so data can't be put into it anymore? Do arrays restrict access to sequential value with only a single read each? These are features of streams that arrays do not have. And they are features which imply a drained state.

Conceptually streams do have a drained state; closed and all possible data read.

@wanderview
Copy link
Member

Of course, now that I think about it some more (sorry, just getting back from PTO so there are cobwebs) I don't think .drained as I defined it above completely solves the problem.

What we really need to know is if the stream is closed, all possible data read, and at least one chunk has ever been read.

That's a bit more unusual, I admit, but I still don't think it would be a problem to implement. We could also just expose it as an algorithm concept in streams so that fetch can use it but its not available to script.

Or there is the wrapper alternative... (I don't recall why that was not adequate...)

@domenic
Copy link
Member Author

domenic commented Jun 8, 2015

I feel like we already have a thread for debating the semantics of streams. I was hoping to use this one to solicit opinions on making the cloning versus not-cloning story sane.

@annevk
Copy link
Member

annevk commented Jun 9, 2015

I'm not really opposed to this. Would also for allow other use-once objects to be added to Request/Response later on. I would like to rename the property if we do this to isUsed or used.

@wanderview
Copy link
Member

I'm happy to defer to @annevk here.

@annevk
Copy link
Member

annevk commented Jun 9, 2015

What do you think @tyoshino? Is this a change Chrome is willing to make?

@jakearchibald
Copy link
Collaborator

var req = new Request(url, { method: "POST", body: "foo" });
fetch(req);
fetch(req);
cache.add(req);

fails on the second fetch, since the body is drained.

Does it? https://fetch.spec.whatwg.org/#concept-http-network-or-cache-fetch step 1 seems to perform a clone, or am I misunderstanding? +@annevk

@jakearchibald
Copy link
Collaborator

This change would break sites that took code from http://jakearchibald.com/2014/offline-cookbook/#on-network-response - given we're starting to see high profile users of ServiceWorker (and therefore fetch & cache), I'm really nervious about this change unless it's backed up by usage stats.

@domenic
Copy link
Member Author

domenic commented Jun 9, 2015

Wait, but isn't that code hilariously broken for non-empty-body requests anyway?

@jakearchibald
Copy link
Collaborator

That's true, but bodied requests will always fail going into the cache. They're GET only.

@domenic
Copy link
Member Author

domenic commented Jun 9, 2015

I guess I don't see how anyone could seriously be using that cookbook code then on a real site, so it doesn't seem like a good argument. If anything I'm now more concerned about the footguns that current patterns allow.

@jakearchibald
Copy link
Collaborator

http://jakearchibald.com/2014/offline-cookbook/#putting-it-together does make it clear that you'd use a conbination of patterns depending on url & method, but I agree that the caching examples should if (request.method != 'GET') return;.

@tyoshino
Copy link
Member

Re @jakearchibald #61 (comment)

https://fetch.spec.whatwg.org/#dom-global-fetch runs the Request ctor. Then, the step 31 of https://fetch.spec.whatwg.org/#dom-request sets used flag on req as it has non-empty body.

@tyoshino
Copy link
Member

It seems it's common pattern that we fetch() and then run cache.put() using the same Request instance (GET). I guess that users already started using the combination.

I'm personally fine with the change regarding the Fetch API in window (included from Chrome 42, last stable). But jakearchibald's view is more important as there're more users in SW.

@yutakahirano
Copy link
Member

@tyoshino
Copy link
Member

Sorry. We just recalled that the current Chrome (stable 43) "incorrectly" behaves as @domenic's idea. Chrome had a bug that it errors the second fetch() in the following scenario:

var r = new Request('http://example.com');
fetch(r);
fetch(r);

In the Chromium bug entry @yutakahirano just posted, we've fixed it to conform to the current Fetch spec. So, till now, no one should be able to reuse a null-body Request for multiple fetch()es.

@annevk
Copy link
Member

annevk commented Jun 10, 2015

Excellent, so let's unfix that as a start and then see about renaming the property.

@tyoshino
Copy link
Member

Yeah, it's accidentally great misbehavior.

Regarding Response, Chrome 43 does allow calling cache.put() with a Response with null-body repeatedly. But it's less common to receive an empty response than a non-empty one. jakearchibald's guide at http://jakearchibald.com/2014/offline-cookbook/#on-network-response includes response.clone() call. Unless one is enthusiastic about optimization (omit calling .clone() for null body responses), the impact of the change would be small, I guess.

@jakearchibald
Copy link
Collaborator

My concerns are more around:

var request = new Request('/');
fetch(request).then(response => {
  caches.open('whatever').then(c => c.put(request, response));
});

…the above currently works in Chrome, are we suggesting it should fail?

@annevk
Copy link
Member

annevk commented Jun 10, 2015

Yes we are.

@jakearchibald
Copy link
Collaborator

I'm really worried about breaking existing sites, especially if we're going in dataless.

At the very least, we'll need to have fetch/cache.put etc implicitly clone Requests/Responses that have an empty body, and show a deprecation warning in the console.

We'll also need to keep bodyUser & show a deprecation warning in the console on use.

@annevk
Copy link
Member

annevk commented Jun 10, 2015

Well per Chrome fetch() would not need to implicitly clone. It seems the only problem might be cache.put(). Also, it's still very early days. We should be prepared to make some API changes I think. That's the whole reason we put it out there, to get feedback on what is broken etc.

@jakearchibald
Copy link
Collaborator

Right, but "Facebook goes down for all Chrome users due to 'service worker'" isn't something I want to read.

@wanderview
Copy link
Member

At the very least, we'll need to have fetch/cache.put etc implicitly clone Requests/Responses that have an empty body, and show a deprecation warning in the console.

Uh, wouldn't the clone fail as well if the used flag is set?

@tyoshino
Copy link
Member

Oh, sorry. I didn't check the sequence in #61 (comment). Right, cache.put() is checking body nullness since long time ago.

@jakearchibald
Copy link
Collaborator

Uh, wouldn't the clone fail as well if the used flag is set?

Yes, but what would cause that to happen?

@jakearchibald
Copy link
Collaborator

If I'm understanding correctly:

  • fetch(req) considers the body to be optional
  • cache.put(req, res) requires the req body to be null (enforced by GET), and the res body to be not-null
  • r.json() etc require the body to be not-null (this would be a change)
  • caches.match(req) does not use the body of req
  • Methods that use the body, only mark it as used if it's not-null (or reject/throw if body is required)

Therefore a method that takes a request/response can be described as one of:

  • Does not use body - so bodyUsed has no impact
  • Requires body - throws if bodyUsed is true, otherwise sets bodyUsed
  • Considers body - throws if bodyUsed is true, otherwise sets bodyUsed if body is present

@wanderview
Copy link
Member

I think cache.put() also calls the Request constructor now to sanitize the input. That would set the used flag through some mechanism, no?

Which step in https://slightlyoff.github.io/ServiceWorker/spec/service_worker/index.html#cache-put doest this?

The step 10 says

Set the request dictionary member of o to a Request object associated with r.

Isn't this "Request" the same object as the argument of cache.put() in the case where the argument is a Request object (not a string)?

You're right. Its just .add()/addAll() that do that.

@wanderview
Copy link
Member

cache.put(req, res) requires the req body to be null (enforced by GET), and the res body to be not-null

Where does cache.put() require the body to be not-null? I've been accepting null body in our implementation.

@annevk
Copy link
Member

annevk commented Jun 25, 2015

@jakearchibald that ignores streams. We need to figure out a model that makes sense with streams.

I have a proposal. We keep the new offset feature to figure out if a stream has been used.

In addition, to solve #55, we give Request an "obsolete flag", set for requests during new Request() for requests with non-null bodies.

In addition we rewrite everything in terms of a high-level "validate the Request/Response stream" algorithm that a) checks for the offset to be 0 b) checks for the obsolete flag to be unset which can be reused by put()/new Request().

Here are some examples to show what this means:

req = new Request({body: ""})
assert(req.body !== null)
assert_false(req.bodyUsed)
fetch(req) // success
assert(req.body === null) // due to transfer
assert_true(req.bodyUsed) // due to obsolete flag
assert_reject(fetch(req)) // due to obsolete flag

req = new Request()
assert(req.body === null)
assert_false(req.bodyUsed)
fetch(req) // success, obsolete flag not set due to null body
assert(req.body === null)
assert_false(req.bodyUsed)
fetch(req) // success

req = new Request({body: "Hallo wereld!"})
assert(req.body !== null)
assert_false(req.bodyUsed)
fetch(req) // success
assert(req.body === null) // due to transfer
assert_true(req.bodyUsed) // due to obsolete flag
assert_reject(fetch(req)) // due to obsolete flag

req = new Request({body: "Hallo wereld!"})
req.body.getReader()
assert(req.body !== null)
assert_false(req.bodyUsed)
assert_reject(fetch(req)) // because lock not released

req = new Request({body: "Hallo wereld!"})
reader = req.body.getReader()
reader.releaseLock()
assert(req.body !== null)
assert_false(req.bodyUsed)
fetch(req) // success

req = new Request({body: "Hallo wereld!"})
reader = req.body.getReader()
reader.read().then(() => {
  reader.releaseLock()
  assert(req.body !== null)
  assert_true(req.bodyUsed) // due to offset
  assert_reject(fetch(req)) // due to offset
})

stream = new Pipe()
req = new Request({body: stream})
assert(req.body !== null)
assert_false(req.bodyUsed)
fetch(req) // success
assert(req.body === null) // due to transfer
assert_true(req.bodyUsed) // due to obsolete flag
assert_reject(fetch(req)) // due to obsolete flag

So bodyUsed is defined as associated stream's offset being greater than 0 or obsolete flag being set. And we end up treating null differently from the empty string.

I think this would be compatible with what we have today and allow for streams.

@jakearchibald
Copy link
Collaborator

@wanderview

Where does cache.put() require the body to be not-null? I've been accepting null body in our implementation.

hm, yeah, maybe it's ok to accept that. Makes sense for redirects.

@jakearchibald
Copy link
Collaborator

@annevk that all looks fine to me.

Why does body become null after fetch? I mean, I understand the body is useless at this point (given the current set of methods), but I don't think I'd expect it to become null.

@jakearchibald
Copy link
Collaborator

Will calling .text() etc update .body.bytesRead? Could I use it to create a progress meter (assuming Content-Length is correct)?

@domenic
Copy link
Member Author

domenic commented Jun 30, 2015

Will calling .text() etc update .body.bytesRead? Could I use it to create a progress meter (assuming Content-Length is correct)?

The design has evolved a bit. Remember that .text() in specific, and anything that reads from the stream in general, is going to use an exclusive reader. The whole point of the exclusive reader is to prevent others from viewing what's going on with the stream, since that would preclude optimizations like off-main-thread work (e.g. consider .json() doing off-main-thread JSON parsing). So, not really. You have to be the one reading to get progress, as always. Anything else means that we have to do synchronization back to the main thread and all that fun stuff.

@domenic
Copy link
Member Author

domenic commented Jun 30, 2015

Er, I accidentally edited the punchline out of my last post before submitting: the .offset property (formerly .bytesRead) belongs on the reader, not on the stream. That's why you can't see it while using .text(). whatwg/streams#367

@jakearchibald
Copy link
Collaborator

the .offset property (formerly .bytesRead) belongs on the reader, not on the stream

Aha, that's the bit I was missing. Ta.

@annevk
Copy link
Member

annevk commented Jun 30, 2015

Why does body become null after fetch?

fetch() invokes new Request() which transfers the body from one place to another as part of sanitizing requests.

@jakearchibald
Copy link
Collaborator

@annevk why is it a 'transfer' rather than bodyUsed or stream lock?

@annevk
Copy link
Member

annevk commented Jul 1, 2015

@jakearchibald I don't understand. fetch() creates a new Request object. I don't really see how that would work otherwise.

@jakearchibald
Copy link
Collaborator

@annevk the way I thought it worked was the new request's stream would be piping the stream of the request passed to its constructor. So:

var r1 = new Request(url, { body: stream });
var r2 = new Request(r1);

Both r1 and r2 would have separate streams. r2 would read from r1's stream to get the output for its stream.

However, your model avoids the additional stream, which I guess is better as long as the transfer isn't confusing for developers.

@annevk
Copy link
Member

annevk commented Jul 1, 2015

We could do that too I suppose. That would leave us with one less null to worry about. I guess implementations could still move the pointer and "lie" to JavaScript...

Module that, does my proposal work? I would prefer it if everyone that reviewed it said so.

@domenic
Copy link
Member Author

domenic commented Jul 1, 2015

@annevk re #61 (comment)

a) checks for the offset to be 0

in whatwg/streams#367 and linked comments, the plan was to compare offset at construction time with offset at fetch/cache.put/etc. time, not to check that it's always zero. But I guess that is the subject of whatwg/streams#367 (comment) so for "v1" we will check it's zero?

Here are some examples to show what this means:

Yay, these are very helpful! This transfer semantic is pretty interesting and I guess does help solve the problem.

Here is one more example that I believe follows from yours:

req = new Request({body: someClosedReadableByteStream})
assert(req.body !== null)
assert_false(req.bodyUsed)
fetch(req) // success
assert(req.body === null) // due to transfer
assert_true(req.bodyUsed) // due to obsolete flag
assert_reject(fetch(req)) // due to obsolete flag

i.e., this is equivalent to {body: ""}.


However, I am afraid of the issue @yutakahirano pointed out in yutakahirano/fetch-with-streams#37 (comment):

req = new Request({ body: "Hello world!" });
req.body.cancel();

// offset is zero since nothing has been read, so
// now this is equivalent to the above case: req.body is just
// a closed, zero-offset RBS.

assert(req.body !== null);
assert_false(req.bodyUsed); // !!
fetch(req);                 // !! success
assert(req.body === null) // due to transfer
assert_true(req.bodyUsed) // due to obsolete flag
assert_reject(fetch(req)) // due to obsolete flag

This situation seems fine to me, but I am afraid @wanderview will not like it, since author code managed to disturb the stream in some way and yet fetch still succeeded/bodyUsed is still false.

@annevk
Copy link
Member

annevk commented Jul 1, 2015

Can we inspect whether a stream is canceled? I agree that scenario does not seem ideal.

@domenic
Copy link
Member Author

domenic commented Jul 1, 2015

No, canceled just means closed because you were told nobody was interested so you closed up prematurely.

@annevk
Copy link
Member

annevk commented Jul 1, 2015

(And yes, the 0 offset check could maybe be removed if we instead complicate the "is synthetic" check elsewhere.)

So a non-read canceled stream is indistinguishable from an empty stream. I guess that's just how it is then... Not sure there's anything we could do about that if that's the semantics the Streams folks think are logical...

@annevk
Copy link
Member

annevk commented Jul 1, 2015

(Note that I'm unclear what problem you think the transfer semantic is solving, as it has been there since the start. I have not touched that part of the Fetch specification for quite a while. I'm also happy to change it to something like @jakearchibald suggests so we can slowly move towards .body never being null which it seems at least some people want and HTTP is also steering towards.)

rakuco pushed a commit to crosswalk-project/blink-crosswalk that referenced this issue Jul 14, 2015
> [Fetch] Check / Set .bodyUsed in Request construction if the src body is null.
> 
> We need to check / set .bodyUsed in RequestConstruction when a Request is given,
> even if the Request has a null body. See [1] for details.
> 
> 1: whatwg/fetch#61
> 
> BUG=501195
> 
> Review URL: https://codereview.chromium.org/1187653003

TBR=yhirano@chromium.org

Review URL: https://codereview.chromium.org/1205053002

git-svn-id: svn://svn.chromium.org/blink/branches/chromium/2403@197739 bbb929c8-8fbe-4397-9dbb-9b2b20218538
@annevk
Copy link
Member

annevk commented Jul 22, 2015

bodyUsed needs to start using whatwg/streams#378 and we need to start using that where we check "used flag" today.

@annevk
Copy link
Member

annevk commented Jul 22, 2015

Actually, we'll do that as a part of a new issue for fetch-with-streams.

@yutakahirano
Copy link
Member

@annevk, I didn't realize that this issue should be addressed. Given that now bodyUsed relies on stream's disturbed property, having another flag sounds a little weird. What do you think about setting Request's body to a locked-closed-disturbed stream instead of null in the Request constructor?

@annevk
Copy link
Member

annevk commented Oct 20, 2015

@yutakahirano could you open a new issue with the scenario that is not covered and your proposed solution? I think that would be clearer and make it easier for everyone to follow what is going on. Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

6 participants