rpc: make websocket ping interval shorter than pong timeout #27726
Hmm. Initially I thought this was obviously a correct fix for an issue. But upon further investigation, I wonder whether the issue is even real. We handle pings as follows: when no message is sent for 30s (wsPingInterval), a ping frame is sent. We then wait for a response from the peer within another 30s (wsPongTimeout). Are you sure this issue is related to these constants? Did you experience such a disconnect in practice?
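To make the timing described above concrete, here is a minimal sketch of the keepalive schedule. The two constants take the 30s values quoted in the comment; `keepaliveSchedule` is a hypothetical helper written for illustration and is not part of the geth codebase.

```go
package main

import (
	"fmt"
	"time"
)

// Values as quoted in the comment above.
const (
	wsPingInterval = 30 * time.Second
	wsPongTimeout  = 30 * time.Second
)

// keepaliveSchedule is a hypothetical helper. It returns, as offsets from the
// last message seen on the connection, when a ping frame is sent and when the
// read deadline expires if no pong arrives.
func keepaliveSchedule(pingInterval, pongTimeout time.Duration) (pingAt, dropAt time.Duration) {
	pingAt = pingInterval
	dropAt = pingInterval + pongTimeout
	return pingAt, dropAt
}

func main() {
	pingAt, dropAt := keepaliveSchedule(wsPingInterval, wsPongTimeout)
	fmt.Println("ping sent after idle for:", pingAt)
	fmt.Println("connection dropped without pong after:", dropAt)
}
```

Under this reading, a responsive peer should never hit the deadline, because the pong is expected to clear it well within the second 30s window.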
When I open 5000-10000 connections to Geth without sending any messages, disconnects start to appear within a few minutes. After changing these constants, the connections no longer disconnect. I think the reason is that the following two places execute concurrently. If Line 288 in 99e000c
Line 354 in 99e000c
Alternative patch: #27733. Please try it.
I submitted an alternative patch because I think it's a logic race. Changing the timeouts will make it less likely, but the race is still there.
I'm closing this in favor of the alternative patch.
This should fix #27726. Under enough load, the SetPongHandler callback can be invoked before pingLoop makes its call to SetReadDeadline. When this occurs, the socket ends up with a 30s read deadline even though the pong was received, which leads to a timeout. The fix here is to process the pong in pingLoop, synchronizing with the code that sends the ping.
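The interleaving described above can be replayed deterministically. The sketch below is an illustration under stated assumptions, not the code of the patch: `fakeConn` is a hypothetical stand-in that only records the last read deadline, and the `pongReceived` channel is one plausible reading of "processing the pong on pingLoop", where the pong handler merely signals and the loop that sent the ping applies both deadline changes in order.

```go
package main

import (
	"fmt"
	"time"
)

const wsPongTimeout = 30 * time.Second

// fakeConn is a hypothetical stand-in for a websocket connection.
// It only remembers the most recent read deadline.
type fakeConn struct {
	readDeadline time.Time
}

func (c *fakeConn) SetReadDeadline(t time.Time) { c.readDeadline = t }

// armed reports whether a read deadline is still pending.
func (c *fakeConn) armed() bool { return !c.readDeadline.IsZero() }

// racyOrder replays the problematic ordering: the pong handler clears the
// deadline before the ping loop has set it, so the deadline stays armed.
func racyOrder() *fakeConn {
	c := &fakeConn{}
	c.SetReadDeadline(time.Time{})                   // pong handler runs first
	c.SetReadDeadline(time.Now().Add(wsPongTimeout)) // ping loop arms the deadline late
	return c
}

// serializedOrder models the synchronized variant: the pong handler only
// records the event, and the ping loop clears the deadline afterwards.
func serializedOrder() *fakeConn {
	c := &fakeConn{}
	pongReceived := make(chan struct{}, 1) // assumed name, for illustration

	// Pong handler fires early but does not touch the deadline.
	select {
	case pongReceived <- struct{}{}:
	default:
	}

	// Ping loop finishes its ping step by arming the deadline.
	c.SetReadDeadline(time.Now().Add(wsPongTimeout))

	// Next loop iteration: the recorded pong clears the deadline.
	select {
	case <-pongReceived:
		c.SetReadDeadline(time.Time{})
	default:
	}
	return c
}

func main() {
	fmt.Println("racy order leaves deadline armed:", racyOrder().armed())
	fmt.Println("serialized order leaves deadline armed:", serializedOrder().armed())
}
```

Because both deadline updates happen on the same goroutine in the second variant, an early pong cannot be overwritten by a late SetReadDeadline call.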
With the current ping/pong configuration, if the client and server exchange no messages for a long time after the websocket connection is established, a read error occurs on the server and the server drops the connection.
This PR adjusts the websocket ping interval, following the official websocket example, to fix the bug.