-
Notifications
You must be signed in to change notification settings - Fork 20.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
graphql, node, rpc: improve HTTP write timeout handling #25457
Conversation
I implemented a different approach we roughly discussed with @fjl: the rpc handler will run the method in a goroutine now. It waits for that to finish OR the context deadline to expire, in which case returns an error. The method will be running in background until it stops (possible leak). Therefore it's important that long-running methods respect context deadline. I made sure this happens in The timeout is being set in the outer layer ( Note: The error type I added feels awkward between official json-rpc errors. |
node/rpcstack.go
Outdated
@@ -198,6 +198,9 @@ func (h *httpServer) ServeHTTP(w http.ResponseWriter, r *http.Request) { | |||
// if http-rpc is enabled, try to serve request | |||
rpc := h.httpHandler.Load().(*rpcHandler) | |||
if rpc != nil { | |||
ctx, cancel := context.WithTimeout(r.Context(), h.timeouts.ReadTimeout-(50*time.Millisecond)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this here. 50 ms is totally a random number. I think lower numbers like 20ms should also be fine. Important thing is the timeout hits before the stdlib's http library kills the connection.
This seems to work well. With the test-method like this: diff --git a/internal/ethapi/api.go b/internal/ethapi/api.go
index 90322033b9..b90ac5048d 100644
--- a/internal/ethapi/api.go
+++ b/internal/ethapi/api.go
@@ -2035,3 +2035,10 @@ func toHexSlice(b [][]byte) []string {
}
return r
}
+
+func (api *DebugAPI) TimeMeOut(ctx context.Context, seconds uint64) (string, error) {
+ log.Info("TimeMeOut sleeping...", "seconds", seconds)
+ time.Sleep(time.Second * time.Duration(seconds))
+ log.Info("TimeMeOut waking up!")
+ return "Oll korrekt!", nil
+}
The
And the timeout triggers after
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I don't like about this PR, is that the RPC handler will keep running in case of timeout, because it executes the RPC call on a background goroutine now.
It would be nicer to keep this out and rely on the individual RPC handlers to exit promptly when the context is canceled.
rpc/handler.go
Outdated
@@ -334,8 +334,19 @@ func (h *handler) handleCall(cp *callProc, msg *jsonrpcMessage) *jsonrpcMessage | |||
return msg.errorResponse(&invalidParamsError{err.Error()}) | |||
} | |||
start := time.Now() | |||
answer := h.runMethod(cp.ctx, msg, callb, args) | |||
answerCh := make(chan *jsonrpcMessage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This channel needs to be buffered, otherwise the handler goroutine will leak when there is a timeout.
I agree about this. If you have a DoS-RPC request that you use to crash some remote node (or infura), then this feature right here would make it even easier to cause DoS. You could just sequentially fire and forget, and get a multihreaded impact. |
TODO Triage discussion: worth doing this or is it dead in the water, due to the inherent limitations of this apprach? |
Alternative approach: When handling RPC call, launch a timer using responseSlot := make(chan struct{}, 1)
responseSlot <- struct{}{}
responded := make(struct{})
timeoutResponseTimer := time.AfterFunc(timeout, func() {
select {
case <-responseSlot:
// The timeout occurred and the method has not responded yet.
// send the timeout error response
close(responded)
case <-responded:
// The method responded.
}
})
response := runMethod()
timeoutResponseTimer.Stop()
select {
case <-responseSlot:
// send the response
close(responded)
case <-responded:
// timeout error response was already sent
} |
This reverts commit 2970cb0c957fa31c6bb3687663652b477495c7bf.
Actually, it can also be done with less channels using var respondOnce sync.Once
timeoutResponseTimer := time.AfterFunc(timeout, func() {
respondOnce.Do(func () {
// send timeout error
})
})
response := runMethod()
timeoutResponseTimer.Stop()
respondOnce.Do(func () {
// send response
}) |
This reverts commit 3feba50.
I have decided to remove changes in eth/filters and eth/tracers from the PR. We can submit them in a separate change. |
RPC method handler changes resubmitted in #26320 |
Here we add special handling for sending an error response when the write timeout of the HTTP server is just about to expire. This is surprisingly difficult to get right, since is must be ensured that all output is fully flushed in time, which needs support from multiple levels of the RPC handler stack: The timeout response can't use chunked transfer-encoding because there is no way to write the final terminating chunk. net/http writes it when the topmost handler returns, but the timeout will already be over by the time that happens. We decided to disable chunked encoding by setting content-length explicitly. Gzip compression must also be disabled for timeout responses because we don't know the true content-length before compressing all output, i.e. compression would reintroduce chunked transfer-encoding.
A work-around for golang/go#47229. Trigger timeout a bit before stdlib closes the connection so users see a proper error message. Note I still kept the WriteTimeout of the http.Server.
I'm not sure what happens if the method handler and the timeout statement both write at the same time
Fixes #21430