Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uploading: how to handle redirects #66

Closed
domenic opened this issue Aug 2, 2016 · 41 comments
Closed

Uploading: how to handle redirects #66

domenic opened this issue Aug 2, 2016 · 41 comments
Assignees

Comments

@domenic
Copy link
Contributor

domenic commented Aug 2, 2016

I see two options:

  • Do as the spec does now, and tee the stream so we can handle redirects. A bit complicated...
  • Only allow uploads if the redirect mode is "error".
    • Automatically set it to "error" if they supply a stream, or require them to opt-in?
    • Should we allow "manual" also?

In practice I think disallowing upload redirects is not that bad. Redirects for uploads seems pretty weird to me... We could also do that as an initial version and then if there is developer need, loosen the restriction later?

@annevk
Copy link

annevk commented Aug 4, 2016

It's not just redirects. It's also authentication. You'd have to set window to null too. There's a note around the place where we tee streams that implementations could optimize for the no-window/no redirects scenario.

@annevk
Copy link

annevk commented Aug 4, 2016

How complicated will teeing be by the way?

(It's annoying that if you want to reduce memory usage you have to pass redirect:"error" and window:null. Ideally we have something simpler for that. But perhaps requiring those to be passed when you pass in a stream is okay for a v1.)

@domenic
Copy link
Contributor Author

domenic commented Aug 4, 2016

Can you explain the window thing? I am not familiar with that part of fetch and what it impacts.

I think teeing is probably not that complicated from a spec perspective. But it might be complicated from an implementation perspective, or at least, complicated to optimize. @yutakahirano had more thoughts on this.

And yeah, I mainly wanted to see how people felt about making v1 a bit more annoying. If we can figure out a way to avoid that (e.g. teeing) that's even better.

@annevk
Copy link

annevk commented Aug 4, 2016

It impacts HTTP authentication dialogs. A user agent can redo a request when given a 401/407 response. Setting window to null means there's no place to display the dialog and therefore the request cannot be redone.

@annevk
Copy link

annevk commented Aug 4, 2016

@yutakahirano
Copy link
Owner

Sorry for the late response.

  • Do as the spec does now, and tee the stream so we can handle redirects. A bit complicated...
  • Only allow uploads if the redirect mode is "error".
    • Automatically set it to "error" if they supply a stream, or require them to opt-in?
    • Should we allow "manual" also?

If it's possible to specify when the stream can be send-able without teeing, I would prefer the second option, at least for now. If users want to send a request with RS without such limitation, we can consider the other option at that time.

@annevk
Copy link

annevk commented Aug 13, 2016

Do you think we can? E.g., what about 401/407?

@yutakahirano
Copy link
Owner

Hmm, looks impossible.

Talked with @tyoshino.

  • Teeing all the data until the response arrives is bad for long (or endless) streams.
  • Authentication is crucial. Developers need to be able to upload a stream with authentication.
  • We need to provide a server with a way to respond with seeing all the request body data.

Can we propose a new HTTP status code (say 102) which means there is no more redirect and authentication (and more?)? IIUC 100 is not enough. Do you think it's a good idea? How hard is it to propose such a status code to IETF?

@annevk
Copy link

annevk commented Aug 19, 2016

@mnot @reschke I believe you have experience with registering status codes. 😊

The idea would be, if I understand correctly, that we don't stream the request body until the server guarantees with a 102 that the eventual response won't be a redirect or authentication dialog. If it doesn't send a 102 we'll just not transmit the body. If it does send a 102 and then doesn't comply it's a network error.

We might also offer the more expensive tee-all-the-things option, but the above would be much better as a default.

@domenic
Copy link
Contributor Author

domenic commented Aug 19, 2016

I think I would prefer to tee all the things, or disallow authentication. Causing a roundtrip tax on all from-JavaScript uploads just because a small minority of cases might need authentication seems like a bad tradeoff.

@annevk
Copy link

annevk commented Aug 19, 2016

If the server transmits early there would not be much of a tax, if at all. And teeing is a pretty big tax.

@wanderview
Copy link

wanderview commented Aug 19, 2016

Sorry if I missed this, but how does the server respond with anything before the upload body is sent?

Edit: Is this possible with chunked encoding?

@yutakahirano
Copy link
Owner

The UA can stop teeing after it sees 102 even with the "tee-all-things option".

@yutakahirano
Copy link
Owner

@wanderview I imagined something like 100 Continue. The server can send some information before sending the final response.

@reschke
Copy link

reschke commented Aug 20, 2016

It's not that complicated to define a new status code (but 102 is taken, btw, see http://www.iana.org/assignments/http-status-codes/http-status-codes.xhtml)

However, I'm both skeptical about whether a code is needed, and the likeliness of a server to get this right.

Is there a summary of what problem this would solve? How likely are large uploads that happen before any other request that would trigger redirects or auth?

Finally, how would the new status code be different from 100 (Continue)?

@domenic
Copy link
Contributor Author

domenic commented Aug 20, 2016

Having thought about this more over the last few days, I maintain that it's not really worth working on this feature if it requires server-side opt in. All the developers we've heard from want to use this uploading functionality on preexisting servers, in most cases outside their control.

So while we could consider minting a new status code (or reusing 100) as some kind of optimization for those that control both sides of the pipeline, I think the initial version of this feature really can't make that a requirement. So either it needs to tee or it needs to fail on redirects and on authenticated endpoints.

@yutakahirano
Copy link
Owner

@reschke Thank you, then let's use 199 in this thread. The problem is that we need to copy the request body to resend it in case redirect / authentication is required until the response arrives. It leads to a memory bloat. IIUC 100 is not enough - the server can serve 307 after sending 100.

Even when the server doesn't trigger redirects, the client cannot know that until it receives the final response. If the server wants to see the request body to decide the response code (200 or 404, for example), the client needs to remember the entire body.

@annevk
Copy link

annevk commented Aug 22, 2016

All the developers we've heard from want to use this uploading functionality on preexisting servers, in most cases outside their control.

This is not true. All the folks wanting to use this to replace their WebSocket stacks have significant control over both ends of the stack and have certainly made noise about wanting this.

Furthermore, we have tried to make the default API for fetch() to be the one that consumes the fewest resources. It would be nice if the least complicated API for uploading was also the most efficient one.

We could still ship with a less efficient model as a first step, but not until we know what the more efficient model would look like so we can preserve the aforementioned aspects of the API.

I tried to think about the 199 model a bit and I'm not entirely sure it helps. Ultimately the client needs to make a decision here about memory and latency.

I think a better model would be that the client requires 199-like semantics or explicitly decides to tee.

If the client requires 199-like semantics, the client can preserve memory usage and start sending the body directly (reducing latency; disregard my earlier comment on this). If the server ends up sending a 401/407/redirect, then we result in a network error. The client should have figured out somehow with the server beforehand that the 199-like semantics can be required. 407 (aka a porxy) makes this tough, but so be it.

API-wise I think it would make the most sense if the explicit teeing instruction was bound to the stream that is passed to fetch() in some way, but we could also add a flag to the fetch() API.

I'm not entirely sure how it should interact with redirect mode. I'm somewhat leaning towards ignoring it in the default non-teeing case and not requiring it to be set to "error" or "manual".

@tyoshino
Copy link
Contributor

Let me summarize proposals as usual.

I'd define a new term here, "resource status", that is statuses that don't require any User-Agent level action such as authentication handling, redirect handling, etc. and therefore are delivered to the application layer. In #66 (comment), Yutaka said 100 is not good. That's because non resource status may follow 100. Only resource status follows 199.

UA could start sending body

  • (a) immediately following the header
  • (b) once the client receives non resource status. No tee.
  • (c) once the client receives 199. No tee.

I understand Julian's comment (#66 (comment)) but not sure if it's reliable that we issue preflight for completing auth given load balancer, etc. So, I'm feeling that we should have (a-2) (and (a-3)), not only (a-1).

(a-2) and (b) requires the server to decide what status code to return before receiving all the body data. If it's impossible, the server could first send 200 and deliver the actual status code later in application level by encoding the status into the body. This might be related to the HTTP trailer API discussion.

(a-3) and (c) requires a new status code and server's opt-in. Solves #66 (comment). The client needs to declare capability of 199 ("Expect: 199"?).

(b) and (c) makes latency and memory consumption of fetch() competitive to WebSocket protocol/API. (a-1) gives us better latency than WebSocket.

Initially I was feeling that the default behavior for streams should be (2) (and (3)) for consistency with other data types. But I don't have strong opinion now.

So, the options would be

  • ""
    • default
    • optimized as (a-2) (a-3)
  • "send-body-when-ready"
    • (b) and (c)
  • "send-body-without-tee"
    • (a-1)

199 support can be added later to enable (a-3) and (c). The client would then start sending "Expect: 199".

@reschke
Copy link

reschke commented Aug 22, 2016

FWIW, we removed "Expect" from HTTP in RFC 7231 except for the already defined 100-continue.

Also, for the server to be able to send a new 1xx code, the client does not need to opt-in. If sending the 1xx breaks the client, the client needs to be fixed.

@annevk
Copy link

annevk commented Aug 22, 2016

@reschke for the "send-body-when-ready" option you need client opt-in of some kind (otherwise the server does not know it needs to transmit an early reply), but maybe we could define a Fetch-specific header for that, since either a new 1xx or an early status code from the server would do.

@reschke
Copy link

reschke commented Aug 22, 2016

What keeps the server from always sending the new 1xx code?

Anyway, this design with client opting asking for the code, and server opting in seems to be identical to the one that was tried before with 100 Continue and which didn't work well in practice -- mainly because the server stacks didn't allow the applications to properly do the 1xx handling (there was a long servlet API spec discussion about that).

@annevk
Copy link

annevk commented Aug 23, 2016

The server always sending it would be redundant with the case where the client transmits without teeing or the server can transmit a response status early.

Anyway, it seems we have enough of an idea without a status code so we could consider that again later.

(Pointers to previous discussions welcome.)

@yutakahirano
Copy link
Owner

yutakahirano commented Aug 25, 2016

What do you think about decoupling the discussion?

  1. Add a means to create a Request object with ReadableStream to the fetch spec. The request body will be teed until the response status is available ((a-2) in Uploading: how to handle redirects #66 (comment)).
  2. Talk about other options ("send-body-when-ready" (b), "send-body-without-tee" (a-1) in Uploading: how to handle redirects #66 (comment)) and how to give such options.
  3. Talk about 199 ((a-3) and (c) in Uploading: how to handle redirects #66 (comment)).

They can be done in parallel.

@annevk
Copy link

annevk commented Aug 25, 2016

I think we should consider "send-body-without-tee" and "" (aka "send-body-with-tee") at the same time, as the former should have the simpler API due to it consuming less resources and therefore being preferable. We don't necessarily have to ship the simpler API first, but we should have it figured out. Everything else can wait.

@yutakahirano
Copy link
Owner

Hmm, OK. I think it's good to choose "send-body-with-tee" as the default option, because it's the easiest one to use of three proposed options (Note: I changed my mind from #66 (comment) assuming we will be able to rely on some optimization such as 199 in the future).

@annevk
Copy link

annevk commented Aug 26, 2016

I think if we do teeing by default we're not being consistent. Elsewhere we require clone() to be invoked to get duplication and an increase in memory usage. I think we should have something similar here. Or at least force it as a choice somehow.

@domenic
Copy link
Contributor Author

domenic commented Aug 26, 2016

I think if we do teeing by default we're not being consistent.

Hmm, can you explain? When we stream uploads that are not created by page authors, we tee, and get the duplication and increase in memory usage. If we're shooting for consistency, I think being consistent with this platform behavior is the most important factor.

As @yutakahirano says, that is the easiest to use. That is presumably why we automatically tee in those cases. I don't see why it should be different for author-created streams.

@annevk
Copy link

annevk commented Aug 26, 2016

Whenever we had the option so far, we opted for better performance. I think we should continue to aim for that.

Could be interesting to do some telemetry to see how often teeing by default is wasted/used.

@giveme6months
Copy link

Have to agree

@yutakahirano
Copy link
Owner

Sorry for the late response.
Thank you, then I am updating #66 (comment):

  1. Add a means to create a Request object with ReadableStream to the fetch spec. The fetch will be terminated when redirect or authentication are required ((a-1) in Uploading: how to handle redirects #66 (comment)).
  2. Talk about other options ("" (a-2) (but let's say "send-body-with-tee"), "send-body-when-ready" (b) in Uploading: how to handle redirects #66 (comment)) and how to give such options.
  3. Talk about 199 ((a-3) and (c) in Uploading: how to handle redirects #66 (comment)).

@domenic
Copy link
Contributor Author

domenic commented Sep 20, 2016

Notes for F2F discussion:

  • What happens if you get a 301 (etc.)?
  • What happens if you get a 401/407?

Possible solutions:

  • Error the upload on 3xx/4xx. Two sub-options:
    • Automatically opt-in to these semantics when body is a stream
    • Require that whenever you set body to a stream, you also set these options (redirect mode = error, authentication mode = error)
  • Tee the stream, send one branch up, wait until response arrives before deleting other branch
  • New status code (199)
    • Allows server to say "throw away the second tee branch", I think

@jakearchibald
Copy link

F2F:

  • Only fail if fetch need to replay the stream & can't
  • You opt into this behaviour by providing a stream
  • If this fails, there's no programmatic way of detecting why (connection fail vs 307 received)

@annevk
Copy link

annevk commented Sep 20, 2016

Also, we decided not to provide opt-in for automatic handling of redirects / authentication right now, and not do a new status code. We can do those as needed later on.

@annevk
Copy link

annevk commented Sep 20, 2016

Also, we decided to allow redirects for 303, since that drops the body and changes the method to GET reliably. (Doing the same for 301/302 when the method is POST could also be done, but seems somewhat less clean.)

@yutakahirano
Copy link
Owner

So the conclusion is that the first release should be (a-1) in #66 (comment), right?

@annevk
Copy link

annevk commented Sep 27, 2016

@yutakahirano yes (with 303 as an exception since it has simple semantics).

@yutakahirano
Copy link
Owner

Thanks.

@jakearchibald
Copy link

@annevk

Doing the same for 301/302 when the method is POST could also be done, but seems somewhat less clean

Isn't the rule just "if the body is needed a second time, it fails"?

@annevk
Copy link

annevk commented Oct 3, 2016

The problem is that what browsers do for 301/302 is somewhat against the spirit of HTTP. So further enshrining it here does not seem great.

@annevk
Copy link

annevk commented Jan 17, 2017

This can now be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants