-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net/http, x/net/http2: http server shutdown doesn't gracefully shut down HTTP2 connections #39776
Comments
cc @fraenkel |
From what I can tell of the test case, the current behavior is correct.
You started a request and called shutdown in the middle. The server reported that this connection will no longer accept any new streams (GOAWAY). The client closed the connection (readFrame EOF). http2 has a different shutdown behavior which isn't documented anywhere. It is similar but it won't wait forever. |
@fraenkel thanks for your response. Could you please check one more time? I think that my test sends 25 requests, each waits for 60 seconds in the handler (after reaching the server) and only then it starts shutting down the server (not in the middle). I consider these requests to be active, being processed. Actually
https://github.com/golang/go/blob/master/src/net/http/h2_bundle.go#L4862 |
Hi @fraenkel The client does not seem to receive a We designed a test to simulate a scenario where the server has an active request, its handler is executing while the server
We kick off Graceful shutdown as soon as
The following events take place when we run the test:
T+0s E2: the server receives the request and starts executing the handler.
T+10s: the following happens:
The graceful termination process in http2 server connection We have added logs to net/http
The client on the other hand encounters an
If we wire Below we can see that the server sends a
The active stream is completed, the client gets a
|
I am using the simplified test above rather than yours from k8s. |
@fraenkel I wrote a test that doesn't use k8s, PTAL p0lyn0mial/kubernetes@81314a2 @fraenkel |
Hi @fraenkel @p0lyn0mial I have copy pasted the non k8s test below for reference. func TestHTTP2GracefulShutdown(t *testing.T) {
callbacks := map[string]http.HandlerFunc{}
handler := func(w http.ResponseWriter, req *http.Request) {
path := req.URL.Path
callback, ok := callbacks[path]
if ok && callback != nil {
callback(w, req)
}
w.WriteHeader(http.StatusOK)
}
waitCh := make(chan struct{})
callbacks["/request-1"] = func(w http.ResponseWriter, r *http.Request) {
<-time.After(10 * time.Second)
close(waitCh)
<-time.After(10 * time.Second)
}
server := httptest.NewUnstartedServer(http.HandlerFunc(handler))
server.EnableHTTP2 = true
server.StartTLS()
defer server.Close()
var shutdownComplete = make(chan struct{})
shutdown := func() {
<-waitCh
ctx, cancel := context.WithTimeout(context.Background(), 20 * time.Second )
server.Config.Shutdown(ctx)
close(shutdownComplete)
cancel()
}
call := func(wg *sync.WaitGroup, server *httptest.Server, path string) {
if wg != nil {
defer wg.Done()
}
client := server.Client()
resp, err := client.Get(fmt.Sprintf("%s/%s", server.URL, path))
if err != nil {
t.Fatalf("request failed, path=%s - %v", path, err)
}
defer resp.Body.Close()
body, err := ioutil.ReadAll(resp.Body)
if err != nil {
t.Fatalf("failed to read body, path=%s - %v", path, err)
}
t.Logf("status code: %v body %s", resp.StatusCode, body)
if resp.StatusCode != 200 {
t.Errorf("unexpected HTTP staus for request, path=%s: %v, expected: 200", path, resp.StatusCode)
}
}
defer server.Close()
go shutdown()
// make one call just to set things in motion and wait for some time
call(nil, server, "request-0")
<-time.After(10 * time.Second)
wg := sync.WaitGroup{}
wg.Add(1)
go call(&wg, server, "request-1")
wg.Wait()
<-shutdownComplete
} |
You are correct. The connection state is not being managed properly. I haven't quite figured out a good way to fix this yet. |
I do have a proper solution for this but its not backward compatible. This issue is that the http server sees the connection as New or Closed, the http2 server doesn't really care but fires ConnState events. A hook registered on the http server, sees the combined state events. Shutdown looks for idle connections, and will see http2 connections as new and will pretty much kill them even when active. One solution is to mark connections Active prior to calling any non-http handler. This would work except the httptest hook, doesn't like Active->Active transitions because the http2 server will fire an Active, Idle transition. Removing the Active event makes things work. This of course breaks backward compatibility given how the code is delivered. |
I have something that will work but I don't quite care for it. I will send it once I simplify the test case. |
Change https://golang.org/cl/240278 mentions this issue: |
@fraenkel I took a look at your patch. The solution is to mark the connections as |
They are never marked idle. They are marked closed. |
CL comment should also reference #36946 |
@fraenkel I see, it will transition to StateClosed once the graceful shutdown is completed. Many thanks for explaining and providing the fix. |
@fraenkel do you know if the fix will be back ported? |
I don't make the calls on back ports but this does qualify since it does not behave as documented and there is no workaround. |
cc @toothrot this needs milestone 1.15 or 1.16 and should be retitled "net/http: ..." Also needs backport to 1.13/14. |
@gopherbot please backport to 1.13 & 1.14. Per @fraenkel this does not behave as documented and there is no workaround. |
Backport issue(s) opened: #40085 (for 1.13), #40086 (for 1.14). Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases. |
/cc @bradfitz as the http package owner. |
@fraenkel I think this needs backports for impending minor releases of 1.13 & 1.14. |
@networkimprov the fix has yet to be approved |
Any idea why? Is it a "risky" fix? |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
The backport issues were closed. Once fixed, new backport issues should be manually posted for 1.15 & 1.14. cc @odeke-em in case he could review the CL. |
SGTM, thank you @networkimprov for the tag and the triage gardening, I’ll
take a look later today.
…On Thu, Sep 3, 2020 at 2:32 PM Liam ***@***.***> wrote:
The backport issues were closed. Once fixed, new backport issues should be
manually posted for 1.15 & 1.14.
cc @odeke-em <https://github.com/odeke-em> in case he could review the CL.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#39776 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABFL3V533GNLYH3MUMGUQKLSEADOFANCNFSM4OFNVDXQ>
.
|
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes, tested on
go version go1.14.4 darwin/amd64
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
I wrote a simple unit test that sets up an
HTTP2
connection between a client and the server, sends a few requests and shuts down the backend server as soon as the last request arrives at the server. Since each request waits for60 seconds
in the handler I consider these requests to be active.I analyzed
net/http
package and here's what I have found out.On calling
http.server.Shutdown()
the server will be closed immediately because the connection is inStateNew
(not inStateActive
). In generalnet/http
package manages connection states and provides a hook (ConnState
) for reporting connections state. For HTTP 1.x it will transition the connection toStateActive
as soon as it reads any byte on the wire. For HTTP 2.xh2_bundle.go
will transition the connection toStateActive
once there is more than one active stream. But the issue here is thatConnState
hook is not wired. At least I haven’t found the wiring. So the server doesn’t know the current state of the connection.I changed
net/http
so that it provides a defaultConnState
method. It is used by the HTTP2 implementation to report back the connection state. With this modification, my test takes~60 seconds
. The server waits until all requests complete before shutting down and all requests complete with success (HTTP 200
).What did you expect to see?
According to the documentation, the
Shutdown
method must wait indefinitely for active connections to return to the idle state and only then shut down the server. In that case, I would expect all requests to complete with success (HTTP 200
) and to test take at least60 seconds
because each request simulates work by sleeping for that duration.What did you see instead?
My test ended after
~6 seconds
, all requests failed withGet "https://localhost:52066": http2: server sent GOAWAY and closed the connection; LastStreamID=49, ErrCode=NO_ERROR, debug="".
The text was updated successfully, but these errors were encountered: