Request body streams that cannot be retried #538

Closed
annevk opened this issue May 8, 2017 · 63 comments · Fixed by #1204

Comments

@annevk
Member

annevk commented May 8, 2017

Currently our design is such that request body streams cannot be retried and that any redirects (other than 303) and HTTP authentication end up failing the request. This is rather nice for memory usage, but according to @ekr it plays badly with TLS 1.3 and QUIC, which both assume requests can be retried.

Sigh.

cc @jakearchibald @yutakahirano @tyoshino @domenic @mnot

@wanderview
Member

wanderview commented May 8, 2017

Can't the network stack use Request.clone() or tee() on the Request body stream when it's in a replayable scenario?

Edit: From the implementation side of things we are already doing this in Gecko. For example, hitting Back and offering the option to re-post data requires this.
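
As a rough illustration of the clone()/tee() idea from the page's side (not from the original comment): `streamBody` is a hypothetical ReadableStream, and `duplex: 'half'` is the option the eventual streaming-upload design requires for stream bodies.

```js
// Illustrative sketch only: keep a replayable copy of a one-shot body by
// cloning the Request, which tees the body stream internally.
// `streamBody` is a hypothetical ReadableStream of Uint8Array chunks.
const original = new Request('https://example.com/upload', {
  method: 'POST',
  body: streamBody,
  duplex: 'half', // required for stream bodies in the eventual fetch() design
});
const replayCopy = original.clone(); // clone() tees the underlying stream

// Send `original`; if the attempt fails in a retryable way, the caller
// could send `replayCopy` instead -- at the cost of buffering.
```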

@annevk
Member Author

annevk commented May 8, 2017

@wanderview part of how we specified this feature is to allow the network stack to not do that, in order to optimize resource allocation. I know it's generally done, but specifically for fetch() with streams we wanted to avoid it to preserve memory. It seems, however, that we might need to do it anyway to play nicely with TLS and QUIC, at which point we might as well start following redirects and support HTTP authentication too.

@domenic
Member

domenic commented May 8, 2017

I'd like to hear more about this TLS and QUIC feature, as it seems strange to me that such protocols would require replaying the entire body.

@ekr

ekr commented May 8, 2017

Well, it's just a general property of networking protocols that you can have data in flight and then the other side sends you some message that causes the connection to fail and you have to retry. 0-RTT is just the most recent case where that crops up.

@yutakahirano
Member

Is there any (fixed) upper bound on the amount of in-flight data?

@annevk
Member Author

annevk commented May 8, 2017

Given https://groups.google.com/d/msg/mozilla.dev.tech.network/glqron0mRko/QMBjliPVAQAJ and its reply, it seems that this can indeed be constrained by the implementation. So for a 10 GiB request body the implementation could just apply backpressure after 100 KiB or so, making sure that much succeeds and otherwise retrying with those 100 KiB.

It might be worth adding a note about this. The re-post-on-back scenario from @wanderview can't be hit by this feature, I think, since that's just <form>.
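
A conceptual sketch of that bounded-buffer idea (editorial illustration, not spec text): the `transport` object with `.confirmed`, `.whenConfirmed()`, and `.write()` is purely hypothetical, and the 100 KiB figure is the one floated above.

```js
// Buffer up to ~100 KiB for a possible replay, then apply backpressure
// until the connection is known-good.
const REPLAY_LIMIT = 100 * 1024;

async function sendWithBoundedReplay(bodyStream, transport) {
  const reader = bodyStream.getReader();
  const retained = []; // chunks kept in case the transport asks for a replay
  let retainedBytes = 0;

  while (true) {
    if (transport.confirmed && retainedBytes > 0) {
      retained.length = 0; // data was accepted; the replay copy can be dropped
      retainedBytes = 0;
    }
    if (!transport.confirmed && retainedBytes >= REPLAY_LIMIT) {
      // Backpressure: stop pulling from the page until e.g. the 0-RTT
      // handshake confirms the early data was accepted.
      await transport.whenConfirmed();
      continue;
    }
    const { value, done } = await reader.read();
    if (done) break;
    if (!transport.confirmed) {
      retained.push(value);
      retainedBytes += value.byteLength;
    }
    await transport.write(value);
  }
}
```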

@ekr

ekr commented May 8, 2017

Even ignoring 0-RTT, there's not really any limit on retry.

Consider the case where you are doing a huge POST. The amount of data you have in flight at any given time is dictated by the usual transport mechanisms, but the bottom line is that the server can send you a TCP RST halfway through. At that point the HTTP stack can either retry or not, but either way you're basically in a posture where you have a network error and you need to do something.

@annevk
Member Author

annevk commented May 8, 2017

Sure, but in that case not retrying seems like a reasonable default for the streams API. That is to say, to make it an application concern (application being the web site).

@ekr

ekr commented May 8, 2017

I think you will find that that causes a surprising level of failures.

@mcmanus may have some thoughts here.

@tyoshino
Member

Does this mean that protocols layered over QUIC need to be prepared for retransmission, just as they are over TCP, and that HTTP layered on top of QUIC has to be prepared for such retries at the granularity of HTTP transactions (or slightly smaller units), rather than bytes/packets as with TCP?

@mfalken

mfalken commented Nov 7, 2017

Just making sure, is the current spec saying that if a service worker does e.respondWith(Response.redirect(...)), the body in the redirected request is null? It looks like that's Chrome's current behavior, and it's quite convenient as it means we don't need to store the request body somewhere before sending it to the service worker.

@wanderview not sure if Firefox has similar behavior, or if we have any WPT tests about this.
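
For reference, the scenario being asked about looks roughly like this (illustrative only; the endpoint paths are hypothetical):

```js
// A service worker answers a POST with a synthetic redirect. The question is
// whether the redirected request still carries the original request body.
self.addEventListener('fetch', event => {
  const url = new URL(event.request.url);
  if (event.request.method === 'POST' && url.pathname === '/old-endpoint') {
    // 307 would normally preserve the method and body across the redirect.
    event.respondWith(Response.redirect('/new-endpoint', 307));
  }
});
```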

@annevk
Member Author

annevk commented Nov 7, 2017

@mattto that's not intentional if it actually says that (and worthy of a separate issue). That would mean that using a service worker instead of a server changes the behavior. While we've accepted that for some cases, we have not for this one.

@mfalken

mfalken commented Nov 9, 2017

@annevk What if the service worker reads event.request.body and then does a redirect? Is the redirected request expected to have the same body?

@annevk
Member Author

annevk commented Nov 9, 2017

@mattto I guess I'd be okay with that being different. We haven't defined this and thought through it in sufficient detail. See also #609 by the way about additional actions to be taken when you drop the body during a redirect, though it's unclear if that should be applicable here.

@yutakahirano
Member

cc: @yoichio

Sorry for the long silence.

> I think you will find that that causes a surprising level of failures.
> @mcmanus may have some thoughts here.

Is this issue addressed by providing error details for web developers so that they can retry?

@annevk
Member Author

annevk commented Nov 7, 2019

Per @ddragana, letting applications handle the error should be okay. We still need some language for 0-RTT though, as per #538 (comment). It's a bit unfortunate there's no testing infrastructure for that, as that's probably the most important part to get right.

@ddragana

ddragana commented Nov 7, 2019

> cc: @yoichio
>
> Sorry for the long silence.
>
> > I think you will find that that causes a surprising level of failures.
> > @mcmanus may have some thoughts here.
>
> Is this issue addressed by providing error details for web developers so that they can retry?

If an error is returned, developers can decide whether to retry or not.
Do we want to automatically retry requests in the HTTP stack without surfacing an error? I would say no. Some requests, like POST, are not safe to retry by definition. (Firefox will not send POST requests during 0-RTT, although HTTP stacks already retry requests automatically in some cases after a network error.)
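
As an illustration of that application-level retry (editorial sketch, not from the thread): `makeBodyStream()` and the URL are hypothetical, and `duplex: 'half'` is what the eventual streaming-upload design requires for stream bodies.

```js
// Because a ReadableStream body can only be read once, each attempt
// needs a fresh stream supplied by the application.
async function uploadWithRetry(makeBodyStream, attempts = 2) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fetch('https://example.com/upload', {
        method: 'POST',
        body: makeBodyStream(), // new stream per attempt
        duplex: 'half',
      });
    } catch (err) {
      if (i === attempts - 1) throw err; // out of retries, surface the error
      // otherwise fall through and retry with a fresh body stream
    }
  }
}
```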

@yoichio
Contributor

yoichio commented Nov 11, 2019

We need to spec something like "for 0-RTT, hold the send buffer until the server confirms it arrived", right?

@annevk
Member Author

annevk commented Nov 11, 2019

Yeah, and also apply backpressure after a given number of KiB has been buffered, so the buffer doesn't have to be too big. Also, we probably need to dig up the formal terminology and reference the relevant RFCs.

@annevk
Member Author

annevk commented Nov 11, 2019

cc @whatwg/http

@MattMenke2
Contributor

MattMenke2 commented Nov 11, 2019

Bodies already have to be replayed on certain redirects, as comment #1 notes. Chrome also replays bodies on certain errors when sending a POST on a reused socket (in the case of errors that could mean the server timed out the socket racily with our decision to reuse it). I had assumed other browsers do that as well, or only use fresh sockets for POSTs, which doesn't seem to work with H2.

Chrome's blob code allows upload bodies to be kept as on-disk objects and streamed as uploads from there. I'm not sure if this applies to all web-initiated uploads, just the ones that explicitly create and upload a blob, or any upload with a sufficiently large body.

@annevk
Member Author

annevk commented Nov 11, 2019

@MattMenke2 to give some context: we can only upload "blobs" today, and those can easily be replayed. For uploading streams we want a design that's minimal in terms of memory usage. So if you upload 1 TiB, it shouldn't also try to buffer 1 TiB. Therefore, as the OP notes, the plan is to fail on redirects and various other known cases that require a replay. The question is whether such a replay can always be avoided, reduced to a fixed buffer of 100 KiB or so, or turned into an error, or whether uploading streams without buffering everything is doomed to fail for one reason or another.

@MattMenke2
Contributor

Thanks for the context! I didn't realize that this was for a streaming upload API, since the term "stream" is rather ambiguous - upload bodies are always streamed to the server over a socket.

The only chunked uploads Chrome supports are internal requests for its speech-to-text API, which does currently buffer the entire upload body for replay, but that certainly doesn't seem like a behavior we'd want exposed to the web.

@yoichio
Copy link
Contributor

yoichio commented Nov 13, 2019

Or, given that Firefox doesn't send POST requests during 0-RTT, if Chromium does the same (@MattMenke2?), how about specifying that so we can avoid retrying a ReadableStream for 0-RTT?

@mnot
Member

mnot commented Nov 13, 2019

WRT not sending POST during 0-RTT, see RFC 8470, Using Early Data in HTTP:

> Absent other information, clients MAY send requests with safe HTTP methods ([RFC7231], Section 4.2.1) in early data when it is available and MUST NOT send unsafe methods (or methods whose safety is not known) in early data.

@mcmanus

mcmanus commented Nov 20, 2019

You really ought to make OPTIONS 0-RTT safe: it's specifically called out by https://tools.ietf.org/html/rfc7231#section-4.2.1 as safe to replay, it's commonly the first exchange on a connection due to CORS preflights, and as a practical matter it doesn't have a body that is hard to buffer.

@annevk
Member Author

annevk commented Nov 20, 2019

@mcmanus that's a good point, though the logic would end up a bit more complicated, as an OPTIONS request with a body shouldn't be 0-RTT safe, I think. So at that point 0-RTT safety would become a property of the request, not of the request method.

@mcmanus

mcmanus commented Nov 20, 2019

It's safe with the body:

> Of the request methods defined by this specification, the GET, HEAD, OPTIONS, and TRACE methods are defined to be safe.

@annevk
Member Author

annevk commented Nov 20, 2019

Sure, but for the purposes of upload streams we're then back to the same questions we had with POST, in that we don't want to support replays of arbitrary length. And it doesn't seem great either to make distinctions based on the type of request body.

@mcmanus

mcmanus commented Nov 21, 2019

I get it, but it's an important use case given how preflight works and the fact that OPTIONS almost never actually has a body. It would seem like an own-goal to give up 0-RTT out of concern over an extreme corner case.

It's been a while since I've read the Streams spec, but surely there is some form of backpressure in it? (Blocking, or flow-control credits, or something.) And surely there is already implicitly some buffering requirement given that you are willing to replay the headers; compared to any OPTIONS body I've ever seen (and they are rare to start with), the upper end of the header size is more than enough for the OPTIONS body.

Can't you just buffer a KiB or two and then let backpressure kick in pending handshake notification?
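
For what it's worth, the Streams API's backpressure is pull-based: a source is only asked for more data once the consumer has drained what was previously queued. A minimal sketch, where `produceNextChunk()` is a hypothetical producer:

```js
// The consumer reading this stream controls the pace: pull() is only
// called when the internal queue is below its high-water mark.
const body = new ReadableStream({
  async pull(controller) {
    const chunk = await produceNextChunk(); // hypothetical; Uint8Array or null
    if (chunk === null) {
      controller.close();
    } else {
      controller.enqueue(chunk);
    }
  }
}, new CountQueuingStrategy({ highWaterMark: 1 }));
```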

@yoichio
Contributor

yoichio commented Nov 21, 2019

If this is going to be long, should we split off the 0-RTT safety issue (which isn't actually directly related to streams, so perhaps fork it into a new thread?) and keep the question of limiting the retry buffer for streams here?

@annevk
Member Author

annevk commented Nov 21, 2019

@mcmanus yeah, something like that is what I suggested in #538 (comment) (with perhaps too big a budget), and then this whole thread happened. It might still be reasonable though, and I wouldn't mind it.

@yoichio aren't retrying for early connection drops and retrying for 0-RTT fundamentally the same issue?

@yoichio
Contributor

yoichio commented Nov 22, 2019

@annevk If retrying here means retrying when the connection drops early during 0-RTT, and buffering the stream is only needed for that, then it is the same issue. And if the early-data request must not contain a stream body, the buffering is no longer needed. Right?

@yoichio
Contributor

yoichio commented Apr 22, 2020

Since it's been a while, let me summarize the discussion so far:

Redirect

fetch() fails with a promise rejection like "Redirect was asked but Stream can be replayed".

Other cases that need a stream replayed

The U.A. might keep a cache of the stream up to a few KiB, both as an implementation restriction and for 0-RTT resends.

@annevk
Member Author

annevk commented Apr 22, 2020

Redirects and server-initiated authentication prompts. And the rejection is the normal TypeError, with perhaps a console-only message giving the reason (which is "cannot be replayed", by the way). We should not expose the failure reason to web content.

And with respect to the cache, I think that's a must, and ideally we specify an upper bound on the budget so it doesn't become a race to the bottom.
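
What web content would observe under that design is just the ordinary network-error rejection (illustrative sketch; `someBodyStream` and the URL are hypothetical):

```js
fetch('https://example.com/upload', {
  method: 'POST',
  body: someBodyStream, // a ReadableStream that cannot be replayed
  duplex: 'half',
}).catch(err => {
  // A redirect or auth prompt that would require a replay surfaces as the
  // usual TypeError; the detailed reason stays console-only.
  console.log(err instanceof TypeError); // true
});
```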

@yoichio
Contributor

yoichio commented Apr 27, 2020

> Redirects and server-initiated authentication prompts. And the rejection is the normal TypeError, with perhaps a console-only message giving the reason (which is "cannot be replayed", by the way). We should not expose the failure reason to web content.

I see. BTW, generally, what makes the difference between showing an error only in the console versus exposing it through an API?

> And with respect to the cache, I think that's a must, and ideally we specify an upper bound on the budget so it doesn't become a race to the bottom.

Does "race to the bottom" mean caching the whole stream, which would make the Streams API memory-inefficient?
If so, I agree with specifying an upper bound.
@MattMenke2 do you know how many KiB we need in Chrome to replay a stream in unavoidable situations like 0-RTT or a socket timeout?

@annevk
Member Author

annevk commented Apr 27, 2020

> BTW, generally, what makes the difference between showing an error only in the console versus exposing it through an API?

Generally we don't let sites learn about network errors, so they do not get information about the user's environment or servers. Putting this in the console is harmless (although with Spectre...). Some of these security decisions have been upended by various Performance WG efforts and I haven't really tried to sort those out (and unfortunately neither have they), but until someone has done the work to better define the contract of what is okay to expose, I'd rather hold the line.

> Does "race to the bottom" mean caching the whole stream, which would make the Streams API memory-inefficient?

Right, a big reason for offering this is allowing minimal memory usage.

@MattMenke2
Contributor

> @MattMenke2 do you know how many KiB we need in Chrome to replay a stream in unavoidable situations like 0-RTT or a socket timeout?

I don't know that, unfortunately. We've always cached the entire thing in Chrome, which means we can handle redirects that require we re-send the body, though I have no idea if our single consumer even needs that.

@annevk
Member Author

annevk commented Apr 27, 2020

It does need that (and that's specified in Fetch as well).

@yoichio
Contributor

yoichio commented Apr 28, 2020

> We've always cached the entire thing in Chrome, which means we can handle redirects that require we re-send the body,

Chrome needs to change that part, at least given the decision that the U.A. should fail rather than resend a ReadableStream on an HTTP 30X response.

Wait. Does 0-RTT also need the server's HTTP response to resend the request when the first authentication failed?
AFAIK, Chrome doesn't process the server response before sending the entire request body (right, @MattMenke2?).
Then resending a request after a 0-RTT failure needs a copy of the entire body in Chromium, just like a redirect does.

@MattMenke2
Contributor

Correct, Chromium only starts reading the response once the entire body has been sent.

@benbucksch

benbucksch commented Jun 15, 2020

HTTP streaming has become so useful that the word "streaming" has become a household word: transmitting audio or audio/video live, radio shows, or streaming a TV tuner signal to a playback device.

The HTTP semantics of a single back/forward function call are very useful and clearly superior to out-of-band protocols like FTP or SIP/RDP. FTP is dead mainly for that reason. And RDP is just as problematic: if you've ever tried to use a SIP softphone, you know what I'm talking about.

What we need is essentially the same thing we call HTTP streaming, but as PUSH instead of PULL: the source of the information initiates the connection, not the target. Still, it's streaming live.

Imagine a one-direction audio call, initiated by the sender, transmitting to a specific target. Say a baby monitor, where the sender opens the connection as soon as it detects activity and sends the stream to a specific target URL. The stream is live and ongoing.

So a retry is inherently impossible, given that it's live data.

@yoichio
Contributor

yoichio commented Jun 15, 2020

We at Chromium are discussing a partial cache mechanism here.

@annevk
Member Author

annevk commented Apr 15, 2021

I wanted to leave a comment here as quite a bit of time has passed. There's a PR up now at #1204 which implements #538 (comment). In particular, for request streams (i.e., when the Streams API is being used) we allow a 64 kibibyte buffer for replays in the networking stack. Once that buffer is exceeded, the only recourse when a resend would be required is to terminate with a network error.

We're not saying anything specific about 0-RTT, and I don't think we need to, as @ddragana said above.
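
For illustration, a streaming upload under the behavior #1204 specifies might look like this (sketch only; the URL is hypothetical, and `duplex: 'half'` is required for stream bodies):

```js
// The UA may retain up to 64 KiB for replays; past that, anything that would
// require a resend (redirect, auth, 0-RTT rejection) rejects the fetch with
// a network error instead.
async function streamingUpload() {
  const { readable, writable } = new TransformStream();
  const responsePromise = fetch('https://example.com/upload', {
    method: 'POST',
    body: readable,
    duplex: 'half',
  });

  const writer = writable.getWriter();
  await writer.write(new TextEncoder().encode('chunk 1'));
  await writer.write(new TextEncoder().encode('chunk 2'));
  await writer.close();

  return responsePromise;
}
```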
