-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature request: support TCP keepalive for downstream sockets #3634
Comments
@ggreenway I had thought we already implemented this for listeners, but it looks like only upstream clusters right now. Is that right? |
It's on my TODO list to implement for listeners. Someone else implemented it for upstreams already. The needed work is to create the config, and then pipe it through to the existing code for setting the correct sockopt. |
Just FYI, we hard-coded some setsockopt() calls into our envoy fork, and it solved the NLB connection reset problem. |
Does the feature included in the version v1.7.0? |
@lwhile To my knowledge this hasn't been implemented yet. However, generic socket option support was added. On a Listener, you can set socket_options. You'll have to figure out the kernel socket option numbers and exact values, but I think it will work. See https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/lds.proto. |
@ggreenway are the socket options set only on listening socket or also on accept'ed ones? Cause what's needed here is the latter |
I don't know off the top of my head. |
We are bitten by this. |
I think we ran into this too while running Istio on AKS: |
Setting socket option for listener actually works. Assuming unix-like environment, configuration looks like follows. listeners {
socket_options {
# SOL_SOCKET = 1
# SO_KEEPALIVE = 9
level: 1
name: 9
int_value: 1
state: STATE_LISTENING
}
# IPPROTO_TCP = 6
# TCP_KEEPIDLE = 4
# TCP_KEEPINTVL = 5
socket_options {
level: 6
name: 4
int_value: 60
state: STATE_LISTENING
}
socket_options {
level: 6
name: 5
int_value: 60
state: STATE_LISTENING
}
} For an example setting listener in as a static resource: static_resources:
listeners:
- name: listener_0
socket_options:
- level: 1
name: 9
int_value: 1
state: STATE_LISTENING
# SOL_SOCKET = 1
# SO_KEEPALIVE = 9
- level: 1
name: 9
int_value: 1
state: STATE_LISTENING
# IPPROTO_TCP = 6
# TCP_KEEPIDLE = 4
# TCP_KEEPINTVL = 5
- level: 6
name: 4
int_value: 60
state: STATE_LISTENING
- level: 6
name: 5
int_value: 60
state: STATE_LISTENING
address:
socket_address:
protocol: TCP
address: 0.0.0.0
port_value: 10000
... |
This is possible using socket options as described. |
Some LoadBalancers do not set KEEPALIVE when they open a TCP connection to the Istio Ingress Gateway. For long living connections it can cause silent timeouts. Therefore envoy must be configured to send KEEPALIVE to downstream (LB). See envoyproxy/envoy#3634
Some LoadBalancers do not set KEEPALIVE when they open a TCP connection to the Istio Ingress Gateway. For long living connections it can cause silent timeouts. Therefore envoy must be configured to send KEEPALIVE to downstream (LB). See envoyproxy/envoy#3634
We use AWS NLB in front of envoy, and it has a 350s idle timeout.
https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout
If the client makes a HTTP/1.1 keep-alive connection to that NLB, sends a request and then sends another one > 350s later, the client will get a connection reset.
We expose that to the internet, so we don't control the clients. However, it looks like envoy's listener could set SO_KEEPALIVE on accepted sockets, so that we can then tune the kernel to send those keepalives more frequently than 350 seconds, and the connection will keep working even if it's idle.
Edit: looks like on linux you can set keepalive parameters like TCP_KEEPCNT, TCP_KEEPIDLE, TCP_KEEPINTVL, in which case you won't need to alter the kernel settings, if envoy supports configuring these.
The text was updated successfully, but these errors were encountered: