Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http2: Server rejects Transport with ErrCodeProtocol stream error #47635

Open
bradfitz opened this issue Aug 11, 2021 · 6 comments
Open
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@bradfitz
Copy link
Contributor

bradfitz commented Aug 11, 2021

We're running a Go 1.17rc2 httputil.ReverseProxy in front of a Go go1.17rc2 net/http.Server.

We were seeing weird ErrCodeProtocol stream errors, so we stopped using the bundled x/net/http2 and switched both to using x/net/http2 explicitly (with http2.ConfigureTransports and http2.ConfigureServer, so the bundled one isn't used).

Our x/net git version is golang/net@aaa1db6, which is today's current master.

The logs we were getting weren't initially enough to tell us where the problem was:

2021/08/10 14:07:20 http: proxy error: stream error: stream ID 1293155; PROTOCOL_ERROR
2021/08/10 14:07:20 http: proxy error: stream error: stream ID 1293157; PROTOCOL_ERROR

But after hacking it up locally to add some expvar counters on which paths in http2.Server were returning the protocol error, we found it was here:

https://github.com/golang/net/blob/aaa1db679c0d7765d2b1cb1f92cac8ebf4d94c53/http2/server.go#L1835

// http://tools.ietf.org/html/rfc7540#section-5.1.2
// [...] Endpoints MUST NOT exceed the limit set by their peer. An
// endpoint that receives a HEADERS frame that causes their
// advertised concurrent stream limit to be exceeded MUST treat
// this as a stream error (Section 5.4.2) of type PROTOCOL_ERROR
// or REFUSED_STREAM.
if sc.curClientStreams+1 > sc.advMaxStreams {
        if sc.unackedSettings == 0 {
                // They should know better.
                return streamError(id, ErrCodeProtocol)
        }

So, the Go http2 client (Transport) and/or server is getting its concurrent/max streams accounting wrong.

Perhaps interesting: the backend (the http2.Server that the ReverseProxy's Transport is hitting) has extremely long-running http.Handlers (like: days or weeks longs) alongside many short ones. That might explain why this bug hasn't been reported by more people.

I see no recent commits that suggest this is a regression. This server is new, so I also can't say whether it worked previously.

/cc @neild @bcmills @fraenkel @dmitshur @dsnet @josharian

@fraenkel
Copy link
Contributor

possible? #42777

@bradfitz
Copy link
Contributor Author

Ah, yes. That's probably what we're hitting.

@dmitshur dmitshur added this to the Backlog milestone Aug 13, 2021
@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 13, 2021
@dmitshur
Copy link
Contributor

CC @neild.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/347033 mentions this issue: http2: make Transport not reuse conns after a stream protocol error

gopherbot pushed a commit to golang/net that referenced this issue Sep 3, 2021
If a server sends a stream error of type "protocol error" to a client,
that's the server saying "you're speaking http2 wrong". At that point,
regardless of whether we're in the right or not (that is, regardless of
whether the Transport is bug-free), clearly there's some confusion and
one of the two parties is either wrong or confused. There's no point
pushing on and trying to use the connection and potentially exacerbating
the confusion (as we saw in golang/go#47635).

Instead, make the client "poison" the connection by setting a new "do
not reuse" bit on it. Existing streams can finish up but new requests
won't pick that connection.

Also, make those requests as retryable without the caller getting an
error.

Given that golang/go#42777 existed, there are HTTP/2 servers in the
wild that incorrectly set RST_STREAM PROTOCOL_ERROR codes. But even
once those go away, this is still a reasonable fix for preventing
a broken connection from being stuck in the connection pool that fails
all future requests if a similar bug happens in another HTTP/2 server.

Updates golang/go#47635

Change-Id: I3f89ecd1d3710e49f7219ccb846e016eb269515b
Reviewed-on: https://go-review.googlesource.com/c/net/+/347033
Trust: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/356978 mentions this issue: [internal-branch.go1.16-vendor] http2: make Transport not reuse conns after a stream protocol error

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/357673 mentions this issue: [internal-branch.go1.17-vendor] http2: make Transport not reuse conns after a stream protocol error

gopherbot pushed a commit to golang/net that referenced this issue Oct 29, 2021
… after a stream protocol error

If a server sends a stream error of type "protocol error" to a client,
that's the server saying "you're speaking http2 wrong". At that point,
regardless of whether we're in the right or not (that is, regardless of
whether the Transport is bug-free), clearly there's some confusion and
one of the two parties is either wrong or confused. There's no point
pushing on and trying to use the connection and potentially exacerbating
the confusion (as we saw in golang/go#47635).

Instead, make the client "poison" the connection by setting a new "do
not reuse" bit on it. Existing streams can finish up but new requests
won't pick that connection.

Also, make those requests as retryable without the caller getting an
error.

Given that golang/go#42777 existed, there are HTTP/2 servers in the
wild that incorrectly set RST_STREAM PROTOCOL_ERROR codes. But even
once those go away, this is still a reasonable fix for preventing
a broken connection from being stuck in the connection pool that fails
all future requests if a similar bug happens in another HTTP/2 server.

Updates golang/go#49077

Change-Id: I3f89ecd1d3710e49f7219ccb846e016eb269515b
Reviewed-on: https://go-review.googlesource.com/c/net/+/347033
Trust: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-on: https://go-review.googlesource.com/c/net/+/357673
Trust: Damien Neil <dneil@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
gopherbot pushed a commit to golang/net that referenced this issue Oct 29, 2021
… after a stream protocol error

If a server sends a stream error of type "protocol error" to a client,
that's the server saying "you're speaking http2 wrong". At that point,
regardless of whether we're in the right or not (that is, regardless of
whether the Transport is bug-free), clearly there's some confusion and
one of the two parties is either wrong or confused. There's no point
pushing on and trying to use the connection and potentially exacerbating
the confusion (as we saw in golang/go#47635).

Instead, make the client "poison" the connection by setting a new "do
not reuse" bit on it. Existing streams can finish up but new requests
won't pick that connection.

Also, make those requests as retryable without the caller getting an
error.

Given that golang/go#42777 existed, there are HTTP/2 servers in the
wild that incorrectly set RST_STREAM PROTOCOL_ERROR codes. But even
once those go away, this is still a reasonable fix for preventing
a broken connection from being stuck in the connection pool that fails
all future requests if a similar bug happens in another HTTP/2 server.

Updates golang/go#49076

Change-Id: I3f89ecd1d3710e49f7219ccb846e016eb269515b
Reviewed-on: https://go-review.googlesource.com/c/net/+/347033
Trust: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-on: https://go-review.googlesource.com/c/net/+/356978
Trust: Damien Neil <dneil@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
dteh pushed a commit to dteh/fhttp that referenced this issue Jun 22, 2022
If a server sends a stream error of type "protocol error" to a client,
that's the server saying "you're speaking http2 wrong". At that point,
regardless of whether we're in the right or not (that is, regardless of
whether the Transport is bug-free), clearly there's some confusion and
one of the two parties is either wrong or confused. There's no point
pushing on and trying to use the connection and potentially exacerbating
the confusion (as we saw in golang/go#47635).

Instead, make the client "poison" the connection by setting a new "do
not reuse" bit on it. Existing streams can finish up but new requests
won't pick that connection.

Also, make those requests as retryable without the caller getting an
error.

Given that golang/go#42777 existed, there are HTTP/2 servers in the
wild that incorrectly set RST_STREAM PROTOCOL_ERROR codes. But even
once those go away, this is still a reasonable fix for preventing
a broken connection from being stuck in the connection pool that fails
all future requests if a similar bug happens in another HTTP/2 server.

Updates golang/go#47635

Change-Id: I3f89ecd1d3710e49f7219ccb846e016eb269515b
Reviewed-on: https://go-review.googlesource.com/c/net/+/347033
Trust: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
fedosgad pushed a commit to fedosgad/oohttp that referenced this issue Jun 22, 2022
… after a stream protocol error

If a server sends a stream error of type "protocol error" to a client,
that's the server saying "you're speaking http2 wrong". At that point,
regardless of whether we're in the right or not (that is, regardless of
whether the Transport is bug-free), clearly there's some confusion and
one of the two parties is either wrong or confused. There's no point
pushing on and trying to use the connection and potentially exacerbating
the confusion (as we saw in golang/go#47635).

Instead, make the client "poison" the connection by setting a new "do
not reuse" bit on it. Existing streams can finish up but new requests
won't pick that connection.

Also, make those requests as retryable without the caller getting an
error.

Given that golang/go#42777 existed, there are HTTP/2 servers in the
wild that incorrectly set RST_STREAM PROTOCOL_ERROR codes. But even
once those go away, this is still a reasonable fix for preventing
a broken connection from being stuck in the connection pool that fails
all future requests if a similar bug happens in another HTTP/2 server.

Updates golang/go#49077

Change-Id: I3f89ecd1d3710e49f7219ccb846e016eb269515b
Reviewed-on: https://go-review.googlesource.com/c/net/+/347033
Trust: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-on: https://go-review.googlesource.com/c/net/+/357673
Trust: Damien Neil <dneil@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants