etcd3 fails to shut down on sigint/sigterm when grpc-gateway has active watches #8224
Yes, this is expected since we use https://godoc.org/google.golang.org/grpc#Server.GracefulStop. The server will block until all clients close their open transports. The same happens with clientv3, e.g.:

```go
wch := cli.Watch(context.Background(), "foo")
<-wch
```
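For context, a self-contained version of that clientv3 scenario looks roughly like the sketch below (endpoint and key are placeholders; the import path shown is from the etcd 3.2 era, while newer releases use go.etcd.io/etcd/client/v3):

```go
package main

import (
	"context"
	"fmt"

	"github.com/coreos/etcd/clientv3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// The watch stream never closes on its own; while it is open, the
	// server's GracefulStop blocks waiting for this client.
	wch := cli.Watch(context.Background(), "foo")
	for resp := range wch {
		for _, ev := range resp.Events {
			fmt.Printf("%s %q -> %q\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
		}
	}
}
```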
@gyuho Is there some way the server can stop the client listener and cut the connections instead of waiting for all clients to disconnect on their own? If clients can keep the server from shutting down indefinitely and there's no way to force the clients to change endpoints, then upgrades would have to kill -9 the etcd process (or jump through other hoops to move the clients) to make forward progress, which seems bad.
@heyitsanthony You are right. I will look into that.
I agree that it's generally a bad idea for servers to fail to respond to a shutdown command just because clients are connected. There's a security component to this as well, as @heyitsanthony mentioned regarding upgrades. If a client can prevent a server from shutting down just by staying connected to it, it could effectively stop a security update. James
Both grpc.Server.Stop and grpc.Server.GracefulStop close the listeners first, to stop accepting new connections. GracefulStop then blocks until all clients close their open transports (connections). Unary RPCs finish within a few seconds, but stream RPCs, like watch, might never close the connection from the client side, making the gRPC server wait forever. This patch still calls GracefulStop, but waits up to 10s before manually closing the open transports. Address #8224. Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
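The fix described in that commit message amounts to racing GracefulStop against a timer. A minimal sketch of the pattern, assuming the 10s grace period from the commit message (the function name is illustrative, not the actual etcd code):

```go
package shutdown

import (
	"time"

	"google.golang.org/grpc"
)

// stopWithTimeout tries a graceful drain first, then force-closes any
// transports (e.g. long-lived watch streams) still open after 10s.
func stopWithTimeout(gs *grpc.Server) {
	done := make(chan struct{})
	go func() {
		gs.GracefulStop() // closes listeners, then waits for active RPCs to finish
		close(done)
	}()
	select {
	case <-done:
		// all clients drained in time
	case <-time.After(10 * time.Second):
		gs.Stop() // hard-close remaining connections so shutdown can proceed
	}
}
```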
Running a single-node cluster from the shell using 3.2.0-rc.1+git. I have a local client connected to the JSON grpc-gateway using POST /v3alpha/watch in "streaming" mode, where the HTTP session is long-lived and each new watch event comes over the wire as a new JSON-encoded HTTP chunk. This part works fine. However, if I try to ctrl-c the etcd process while the watch is still active, the process hangs in a loop with the errors below. The only way to get it to terminate is either sigkill or terminating the watch client.
```
^C2017-07-06 17:02:31.710640 N | pkg/osutil: received interrupt signal, shutting down...
2017-07-06 17:02:31.710981 I | etcdserver/api/v3rpc: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: getsockopt: connection refused"; Reconnecting to {127.0.0.1:2379 <nil>}
2017-07-06 17:02:32.711047 I | etcdserver/api/v3rpc: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: getsockopt: connection refused"; Reconnecting to {127.0.0.1:2379 <nil>}
```
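For reference, a minimal Go client can reproduce the streaming watch described above. This is a sketch under the following assumptions: etcd's gateway is listening on 127.0.0.1:2379, and the v3alpha gateway expects keys base64-encoded, so "foo" becomes "Zm9v":

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Open a watch on key "foo" via the JSON grpc-gateway.
	body := strings.NewReader(`{"create_request": {"key": "Zm9v"}}`)
	resp, err := http.Post("http://127.0.0.1:2379/v3alpha/watch", "application/json", body)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The response stays open; each watch event arrives as a JSON chunk.
	// While this loop runs, ctrl-c on the etcd process reproduces the hang.
	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		fmt.Println(sc.Text())
	}
}
```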