Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

possibly wrong handling of error_code? #87

Closed
dimaqq opened this issue Feb 27, 2020 · 7 comments · Fixed by #441
Closed

possibly wrong handling of error_code? #87

dimaqq opened this issue Feb 27, 2020 · 7 comments · Fixed by #441

Comments

@dimaqq
Copy link

dimaqq commented Feb 27, 2020

https://github.com/encode/httpx/blob/50d337e807839c21e796fd8b01c67d8a672a9721/httpx/_dispatch/http2.py#L147-L155

IIRC, RFC defines 2 messages that may carry error_code:

  • RST_STREAM - forcibly close 1 stream
  • GOAWAY - close entire connection (gracefully or forcibly)

In addition, h2 might generate ResetStream event internally:

when the remote party has made a protocol error which only affects a single stream

I find it weird that there's only this one place where error_code is evaluated in the library 🤔

@tomchristie
Copy link
Member

Indeed. I wonder if a more comprehensive tack onto this would be...

  • If the event is associated with a stream (Ie. bool(event_stream_id) evaluates as True.) then raise ProtocolError here as we're currently doing.
  • If the event is not assoicated with a stream then store the error on the connection class, raise it now, and also raise it on any interaction with existing streams.

Does that sound sufficient here?

@dimaqq
Copy link
Author

dimaqq commented Feb 28, 2020

GOAWAY

Kindof, with a caveat:

if the connection is being closed:

  • respond to that (close the socket, etc)
  • record the fact that connection is dead
  • [see below for this stream]
  • on future interaction for other streams:
    • allow reading out buffered data (from before the goaway frame)
    • then raise

For this stream: I doubt raising right away is correct, because socket.read() could have returned multiple events for same stream, and (subset of) previous events could satisfy the caller.

Thus, I think it's simpler to change the handling to:

if the event is not associated with a stream, then store the error, close the connection, and on future interactions that require socket io, then raise

What do you think?

RST_STREAM

I think the data model needs to change a bit, for example:

  • either self.events[stream_id] = [e1, e2, e3, sentinel]
  • or self.events[stream_id] = [e1, e2, e3] + self.stream_state[stream_id] = error_event

@dimaqq
Copy link
Author

dimaqq commented Feb 28, 2020

GOAWAY error code

The spec says:

... GOAWAY allows an
endpoint to gracefully stop accepting new streams while still
finishing processing of previously established streams. This enables
administrative actions, like server maintenance.

https://tools.ietf.org/html/rfc7540#section-6.8

I'm not an expert, but to me, this sounds like client should have 2 code paths:

  • GOAWAY with error code 0 (graceful termination):
    • continue sending and receiving data on existing streams (per Last-Stream-ID from GOAWAY)
    • can't create new streams
  • GOAWAY with an error code (non-zero, error):
    • use the data received prior to GOAWAY
    • don't transmit on existing streams, don't create new streams
    • close the connection

↑↑↑ That's just my reading, I could be wrong... In fact after reading the spec again, I'm not sure that non-zero goaway means terminate...

Ref for non-zero error code: https://tools.ietf.org/html/rfc7540#section-5.4.1

Perhaps h2 is meant to handle it instead, but it doesn't seem to make the distinction: https://github.com/python-hyper/hyper-h2/blob/570dc7daa480d34bcf0676e88d241112c51b1796/h2/connection.py#L1825-L1844 🤷‍♂

@dimaqq
Copy link
Author

dimaqq commented Feb 28, 2020

RST_STREAM error code

https://tools.ietf.org/html/rfc7540#section-8.1.4 contains this special case:

 The REFUSED_STREAM error code can be included in a RST_STREAM
 frame to indicate that the stream is being closed prior to any
 processing having occurred.  Any request that was sent on the
 reset stream can be safely retried.

So, handling should be different depending on error_code == 7 🦄
h2 includes error_code in the event, so we can see it :)

@florimondmanca
Copy link
Member

@dimaqq This looks like a weak spot on our end, yes… Have you seen any issues caused by this yet?

@dimaqq
Copy link
Author

dimaqq commented Mar 2, 2020

Maybe... I'm not sure.
In the past we've hit cases where connection was closed seemingly in response to auth failure in one of the requests.
Now, tbh, we've never traced that (I can't be sure stream was being reset), and I can't even be certain what library version we used and even if we were already using/trying httpx or if used plain h2.
In our case all requests were idempotent, so we just retried everything :)

It's possible that the cause was that RST_STREAM was being treated like GOAWAY and that being taken as hard connection shutdown. 🤷‍♂
The server was Apple's APNS, so it's not too easy to investigate or make an MRE. 😢

@dimaqq
Copy link
Author

dimaqq commented Mar 25, 2020

The default nginx HTTP/2 configuration is to limit number of requests served over a single connection and the default is merely 1000.

Ref: http://nginx.org/en/docs/http/ngx_http_v2_module.html#http2_max_requests

If someone runs nginx servers and doesn't change that setting, httpx performs poorly.

How nginx limits the requests per connection:

  1. nginx doesn't set max_stream_id in advance
  2. it sends <ConnectionTerminated error_code:ErrorCodes.NO_ERROR, last_stream_id:0, additional_data:None>

re: 1.: I think it really should inform client in advance how many streams are allowed, oh well.
re: 2.: it's strange that last_stream_id is set to 0, because that allows client to retry non-idempotent requests, it's really against the RFC

So, uhm, it would be nice to support nginx defaults, or limits in general, but it's also the case that httpx changes alone are not enough, nginx also needs changes and hyper-h2 needs a patch (python-hyper/h2#1181).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants