Kestrel doesn't reset KeepAlive timeout on HTTP2 PING frames #24601

davidni · 2020-08-05T22:28:57Z

Describe the bug

NOTE: This is related, but a separate issue from #15104. That one is about server-initiated PING's to the client to proactively detect zombie connections; whereas this one is about Kestrel not resetting its keepalive timeout in response to client-initiated PING's.

gRPC clients can be configured to send periodic pings to keep a channel open for prolonged periods of time even when there aren't any active streams. Kestrel however drops an idle connection after KestrelServerLimits.KeepAliveTimeout elapses. Http2Connection does not reset the TimeoutControl when PING frames are received, resulting in connections that (supposedly) should remain open to be aborted

To Reproduce

Use a gRPC client configured as follows (using gRPC for C#), with the following options:

var options = new List<ChannelOption>()
{
    // ...
    new ChannelOption("grpc.client_idle_timeout_ms", 15 * 60 * 1000),
    new ChannelOption("grpc.keepalive_time_ms", 10 * 1000),
    new ChannelOption("grpc.keepalive_permit_without_calls", 1),
};

This should result in PING frames being sent every 10 seconds, and the channel should remain open up to 15 minutes even when idle. However, Kestrel terminates the connection after its KeepAliveTimeout elapses (2 min by default).

Exceptions

N/A

Further technical details

ASP.NET Core 3.1 (code inspection indicates latest master has the same issue)

Suggested fix would be to add a call to TimeoutControl.SetTimeout(Limits.KeepAliveTimeout.Ticks, TimeoutReason.KeepAlive); when processing a PING frame in Http2Connection.ProcessPingFrameAsync.

Is this a legit bug or is it by design?

@JamesNK @Tratcher

The text was updated successfully, but these errors were encountered:

Tratcher · 2020-08-05T22:54:39Z

This is currently by design, KeepAliveTimeout only tracks request activity, not pings. HttpClient does something similar, they're adding a new flag to give more control over the behavior. If you wanted this in Kestrel we'd need a similar new API.

dotnet/runtime#31198

JamesNK · 2020-08-05T23:16:02Z

Hmmm, I'm not sure about that. The additional flag HttpClient has is to control whether keep alive pings are sent on a HTTP/2 connection without active streams.

PING frames should reset the timeout. In fact I think that is the behavior today.

Tratcher · 2020-08-05T23:23:23Z

I'd expect the server to be more defensive about long lived connections than the client due to resource constraints.

JamesNK · 2020-08-05T23:35:20Z

What are the rules today about Kestrel keeping a HTTP/2 connection active that has no active requests? Does it wait for a configured period before closing it, or does it wait for the client to close it?

Other HTTP/2 servers are quite strict by default when it comes to accepting incoming PINGS. Other servers, by default, don't allow pings if there are no active streams on the connection.

JamesNK · 2020-08-05T23:38:28Z

I was thinking a few days ago that keep alive pings really should be tested end-to-end. This issue is a good prompt. Really need to test:

Grpc.Core client + Grpc.Core server
HttpClient + Grpc.Core server
Grpc.Core client + Kestrel
HttpClient + Kestrel

And for each of them test what happens with active streams, and without active streams.

Tratcher · 2020-08-05T23:38:40Z

I think they're the same as HTTP/1, only active requests keep the connection alive. Otherwise it waits for the KeepAliveTimeout and closes it.

JamesNK · 2020-08-05T23:51:34Z

Ok. I'm trying to think of whether this is a flaw in the keep alive ping feature. I didn't realize KeepAliveTimeout didn't reset on ping frames, and none of my tests went over 2 minutes. One of the scenarios for keep alive pings is to keep a connection warm, despite there being no active requests.

Option 1:

Recommend customers with servers that want long-lasting warm connections to set KeepAliveTimeout = TimeSpan.MaxValue and configure a server keep-alive delay + timeout. That allows a server to be pro-active about deciding whether to close a HTTP/2 connection. However that wouldn't work well if a server was serving both HTTP/1.1 and HTTP/2.

Option 2:

Http2Limits.KeepAlivePingDelay + Http2Limits.KeepAlivePingTimeout overrides KeepAliveTimeout for HTTP/2 (I really wish this was called something like IdleTimeout 😄).

Similar to option 1, but allows the server to be more specific about HTTP/2 behavior.

Option 3:

A setting to control whether pings reset the KeepAliveTimeout timer if there are no active requests. To be defensive it should default to false. Something like:

public class Http2Limits
{
    public bool AllowKeepAlivePingWithoutRequests { get; set; } = false;
    
    // or
    public bool KeepAlivePingsResetIdleTimer { get; set; } = false;
}

If it is false then Kestrel won't throw an error if it gets a ping but it won't reset KeepAliveTimeout timer. This matches current behavior.
If it is true then Kestrel resets KeepAliveTimeout timer.

davidni · 2020-08-06T01:19:44Z

One of the scenarios for keep alive pings is to keep a connection warm, despite there being no active requests.

This is precisely the scenario that led me to identifying this issue 😄

I really wish this was called something like IdleTimeout

Ditto. Something like Go's http server options perhaps.

Option 1 is a non-starter, we cannot allow infinite duration HTTP/1.1 connections, as you mentioned
Option 2 if I understand it correctly, this does not address the issue, and client ping's are still ignored for the purposes of managing server timeouts, right?
Option 3: KeepAlivePingsResetIdleTimer takes care of the issue at hand. Not sure why we would also need AllowKeepAlivePingWithoutRequests?

@JamesNK

JamesNK · 2020-08-06T01:47:22Z

Option 2 if I understand it correctly, this does not address the issue, and client ping's are still ignored for the purposes of managing server timeouts, right?

KeepAlivePingTimeout does use PING to reset its timer. Basically option 2 would turn off the passive KeepAliveTimeout for HTTP/2 and replace it with the active KeepAlivePingDelay/KeepAlivePingTimeout.

Option 3: KeepAlivePingsResetIdleTimer takes care of the issue at hand. Not sure why we would also need AllowKeepAlivePingWithoutRequests?

There is an OR there. They're alternative names for a property that does the same thing. I think I prefer KeepAlivePingsResetIdleTimer.

I think I like option 3 the most. Option 2 feels auto-magical and hard to describe, while option 3 is a distinct setting to enable an easy to understand behavior: HTTP/2 connections won't time out if they're pinged.

davidni · 2020-08-06T03:40:28Z

Gotcha, so Option 3 really is a strict superset of Option 2 (i.e. it can do everything that Option 2 can, but additionally supports the use case where clients are allowed to extend the lifetime of a connection with client-initiated PING's regardless of server-initiated PING's). I too prefer Option 3.

Side note: It may be interesting to consider making these settable per connection (as opposed to globally in Kestrel limits), though I don't have a concrete use case for that at the moment.

JamesNK · 2020-08-07T03:26:18Z

Thoughts? @halter73 @Tratcher @davidfowl

halter73 · 2020-08-07T04:18:27Z

Nice catch @davidni. Pings should reset the keep-alive timeout period. No need for a setting.

With HTTP/1.1, there's no alternative to the client sending a new request to reset the keep-alive timeout. If there was a way for the client to ping an HTTP/1.1 connection, said ping would reset the timeout.

Tratcher · 2020-08-07T04:22:30Z

@halter73 won't that allow a client to keep a connection open indefinitely with no requests? That seems like something the server should have control over.

halter73 · 2020-08-07T04:30:04Z

Absolutely. And like I said, if a client could do so for an HTTP/1.1 connection, we would allow it.

Option 1 of telling people to set an effectively infinite keep-alive timeout is a nonstarter, because that allows a client to just disappear (never send a FIN or RST) and Kestrel would keep the connection alive forever. This would be a big problem for both HTTP/1.1 and HTTP/2. The OS nor anything else would do anything to tell Kestrel the socket is closed (at least not until ~30,000 more years when the TimeSpan.MaxValue finally elapses). TCP keep-alive is a rarely used feature, and I've never seen or heard of it used for any sort of HTTP connection.

The point of the KeepAliveTimout is to avoid exactly that scenario where Kestrel doesn't know that the socket is really gone. HTTP/2 pings tell us the socket is still active, and that's good enough. If the concern is that a malicious client could use a ping to keep the connection alive for cheap, that could already happen with a short GET request.

JamesNK · 2020-08-07T04:36:40Z

There are two kinds of pings to consider:

The server has a keep alive ping configured. When there is no activity the server is responsible for starting the ping, it then gets a PING ACK frame (which could be made to contain a payload to validate it).
We could make this ack ping reset the KeepAliveTimeout timer by default safely because a server has to opt-in to enabling keep alive pings, and it is the one initiating the ping. However this behavior would mean that if you wanted a server to have warm connections then the server would be the one that needs keep alive pings enabled.
Client initiated ping. The server gets the ping, then ACKs it.
I worry about these pings automatically resetting the KeepAliveTimeout timer because now there is no way to configure Kestrel to close idle connections. Although @halter73 brings up a good point that you can get around this anyway by using a GET request instead of a ping.

JamesNK · 2020-08-07T04:45:09Z

Actually relying on keep alive PING ACKs wouldn't work (1). The keep alive ping delay is based on receiving any frames. A server initiated keep alive ping wouldn't be sent if the server was receiving pings, but those pings wouldn't reset KeepAliveTimeout.

halter73 · 2020-08-07T04:46:42Z

I want to avoid adding too many Http2Limits. It's already quite a lot for the majority of people who aren't intimately familiar with the HTTP/2 protocol. Before adding yet another configurable limit, I'd like to know of a real customer/scenario that needs to make this configurable.

It's clear we need to fix pings not resetting the keep-alive timeout. That's just a bug. Really, any frame received while there are not active requests should reset the timeout.

Tratcher · 2020-08-07T04:47:46Z

that could already happen with a short GET request.

Fair. I think 2 makes more sense. It's the client that wants the connection to stay warm so let them initiate the pings.

Tratcher added area-servers feature-yarp This issue is related to work on yarp feature-kestrel labels Aug 5, 2020

Tratcher added the enhancement This issue represents an ask for new feature or an enhancement to an existing one label Aug 5, 2020

JamesNK self-assigned this Aug 5, 2020

JamesNK added this to the 5.0.0-rc1 milestone Aug 7, 2020

JamesNK mentioned this issue Aug 7, 2020

Reset KeepAliveTimeout on HTTP/2 ping #24644

Merged

JamesNK closed this as completed in #24644 Aug 11, 2020

JamesNK mentioned this issue Aug 13, 2020

3.1: Reset KeepAliveTimeout on HTTP/2 ping #24858

Merged

ghost locked as resolved and limited conversation to collaborators Sep 10, 2020

amcasey added area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions and removed area-runtime labels Jun 2, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Kestrel doesn't reset KeepAlive timeout on HTTP2 PING frames #24601

Kestrel doesn't reset KeepAlive timeout on HTTP2 PING frames #24601

davidni commented Aug 5, 2020

Tratcher commented Aug 5, 2020

JamesNK commented Aug 5, 2020

Tratcher commented Aug 5, 2020

JamesNK commented Aug 5, 2020 •

edited

Loading

JamesNK commented Aug 5, 2020 •

edited

Loading

Tratcher commented Aug 5, 2020 •

edited

Loading

JamesNK commented Aug 5, 2020 •

edited

Loading

davidni commented Aug 6, 2020 •

edited

Loading

JamesNK commented Aug 6, 2020 •

edited

Loading

davidni commented Aug 6, 2020

JamesNK commented Aug 7, 2020

halter73 commented Aug 7, 2020

Tratcher commented Aug 7, 2020

halter73 commented Aug 7, 2020 •

edited

Loading

JamesNK commented Aug 7, 2020

JamesNK commented Aug 7, 2020

halter73 commented Aug 7, 2020 •

edited

Loading

Tratcher commented Aug 7, 2020

Kestrel doesn't reset KeepAlive timeout on HTTP2 PING frames #24601

Kestrel doesn't reset KeepAlive timeout on HTTP2 PING frames #24601

Comments

davidni commented Aug 5, 2020

Describe the bug

To Reproduce

Exceptions

Further technical details

Tratcher commented Aug 5, 2020

JamesNK commented Aug 5, 2020

Tratcher commented Aug 5, 2020

JamesNK commented Aug 5, 2020 • edited Loading

JamesNK commented Aug 5, 2020 • edited Loading

Tratcher commented Aug 5, 2020 • edited Loading

JamesNK commented Aug 5, 2020 • edited Loading

davidni commented Aug 6, 2020 • edited Loading

JamesNK commented Aug 6, 2020 • edited Loading

davidni commented Aug 6, 2020

JamesNK commented Aug 7, 2020

halter73 commented Aug 7, 2020

Tratcher commented Aug 7, 2020

halter73 commented Aug 7, 2020 • edited Loading

JamesNK commented Aug 7, 2020

JamesNK commented Aug 7, 2020

halter73 commented Aug 7, 2020 • edited Loading

Tratcher commented Aug 7, 2020

JamesNK commented Aug 5, 2020 •

edited

Loading

JamesNK commented Aug 5, 2020 •

edited

Loading

Tratcher commented Aug 5, 2020 •

edited

Loading

JamesNK commented Aug 5, 2020 •

edited

Loading

davidni commented Aug 6, 2020 •

edited

Loading

JamesNK commented Aug 6, 2020 •

edited

Loading

halter73 commented Aug 7, 2020 •

edited

Loading

halter73 commented Aug 7, 2020 •

edited

Loading