Nginx Controller using endpoints instead of Services #257

Nalum · 2017-02-10T10:19:22Z

Is there any reason as to why the Nginx controller is set up to get the endpoints that a Service uses rather than using the Service?

When I roll out an update to a deployment, some requests are sent to pods that are being terminated, which results in 5XX errors because nginx still thinks those pods are available. Wouldn't using the Service take care of this issue?

I'm using this image: gcr.io/google_containers/nginx-ingress-controller:0.8.3

I'd be happy to look into changing this so it works with the Services rather than the Endpoints if it makes sense to everyone that it does that.

yissachar · 2017-02-10T16:56:41Z

The old docs explained:

> The NGINX ingress controller does not use Services to route traffic to the pods. Instead it uses the Endpoints API in order to bypass kube-proxy to allow NGINX features like session affinity and custom load balancing algorithms. It also removes some overhead, such as conntrack entries for iptables DNAT.
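
To make the difference concrete, here is a minimal conceptual sketch in Python. It is not the controller's actual template: the upstream name, pod IPs, and ClusterIP are made-up values, and the helper `render_upstream` is hypothetical. It only illustrates that in endpoints mode nginx sees every pod as a separate server, while in Service mode it would see a single virtual IP and leave the balancing to kube-proxy.

```python
def render_upstream(name, servers):
    """Render an nginx `upstream` block for the given "ip:port" servers."""
    lines = [f"upstream {name} {{"]
    lines += [f"    server {server};" for server in servers]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical data: three pod endpoints behind one Service.
pod_endpoints = ["10.244.1.5:8080", "10.244.2.7:8080", "10.244.3.9:8080"]
cluster_ip = "10.96.12.34:80"

# Endpoints mode: nginx balances across the pods itself.
endpoints_mode = render_upstream("default-testapp-80", pod_endpoints)
# Service mode: nginx only knows the ClusterIP; kube-proxy picks the pod.
service_mode = render_upstream("default-testapp-80", [cluster_ip])

print(endpoints_mode)
print(service_mode)
```

With only one server in the upstream, nginx-level features such as session affinity or custom balancing algorithms have nothing to choose between, which matches the trade-off the old docs describe.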

Nalum · 2017-02-10T17:32:12Z

Ah okay.

Thanks for that, I didn't find it in my searches.

Are there efforts being made to make it more responsive to the readiness and liveness checks, do you know?

aledbf · 2017-02-10T17:34:24Z

> Are there efforts being made to make it more responsive to the readiness and liveness checks, do you know?

@Nalum yes. We need to release a final version before changing the current behavior.
The idea (we need a POC first) is to use the kong load balancer code that allows the update of the nginx upstreams without requiring a reload.

Nalum · 2017-02-13T12:05:38Z

@aledbf anything I can do to help with this?

Nalum · 2017-02-22T12:10:18Z

Just dropping these comments here:

aledbf Slack 11:59 AM 2017-02-22
@Nalum about that, I just need time 🙂. We need to release 0.9 first. I hope that after beta 2 (#303) we can release 0.9. After that I will start looking at the Kong load balancer Lua code. Basically this is the code we need to understand and “port”: Kong/kong#1735

aledbf Slack 12:00 PM 2017-02-22
@Nalum please take this just as an idea and POC. We really need to find a way to avoid reloads and to handle nginx upstreams better

rikatz · 2017-06-07T17:02:37Z

OK, I was taking a look at this:

https://github.com/yaoweibin/nginx_upstream_check_module

Maybe we can open a feature request to make this configurable, but I don't think that is what this issue is about.

About using Services instead of Pods, we also have to keep in mind that Service IPs are valid only inside a Kubernetes cluster, while Pod IPs might be valid in your network (using Calico or some other solution).

Anyway, the ingress controller gets the upstream IPs from the Service (it connects to the Service, watches it, and checks for Pod IP changes behind it), so it might make sense to health check the upstream Pods with the module above instead of using the Service IP :)

@aledbf Should we open a feature request for using the upstream healthcheck module, or should we keep this issue open?

Thanks!

Nalum · 2017-06-07T17:21:51Z

@rikatz correct me if I'm wrong, but from the brief look I took at that repo it doesn't seem to use the Kubernetes health checks. I think it would be better to take advantage of those rather than adding additional checks, i.e. to have nginx be more responsive to changes in the Service.

aledbf · 2017-06-07T17:58:23Z

@rikatz adding an external module for this just duplicates the probes feature already provided by kubernetes.

aledbf · 2017-06-07T17:59:19Z

@rikatz what this issue is really about is how we update the configuration (changes in pods) without reloading nginx

rikatz · 2017-06-07T18:29:32Z

Yep, the Kube health check is the best approach. I was just thinking about an alternative :)

I will also take a look at how to change the config without reloading nginx.

rikatz · 2017-06-09T12:50:19Z

@aledbf what about this module: https://github.com/yzprofile/ngx_http_dyups_module

The ingress controller could add and remove upstreams through HTTP requests to it. This is the same approach that NGINX Plus uses to add/remove upstreams without reloading or killing the nginx process

aledbf · 2017-06-09T14:04:53Z

@rikatz I tried that module like a year ago (without success). Maybe we need to check again

rikatz · 2017-06-12T14:00:24Z

@aledbf I did some quick checks here and it worked 'fine'.

But the following situations still 'reload' nginx:

New vHost
New Path (changed Ingress)
New SSL Cert

I couldn't verify whether NGINX gets reloaded on each upstream change or not.

So, is this upstream module still applicable to the whole problem? I can start writing a PoC of this approach (it includes changing the nginx.tmpl, the core and a lot of other things), so if you think it's still a valuable change, we can do this :D

Thanks
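
The split described above (endpoint-only changes applied dynamically, everything else still needing a reload) can be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the `Snapshot` type and `plan_update` function are hypothetical, not part of the controller or of ngx_http_dyups_module, and the real decision would involve much more state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical, simplified view of the state the controller renders."""
    hosts: frozenset
    paths: frozenset
    certs: frozenset
    endpoints: frozenset


def plan_update(old, new):
    """Decide how to apply a change between two snapshots.

    Returns ("reload", None) when vhosts, paths or certs changed,
    ("noop", None) when nothing changed, and otherwise
    ("dynamic", (added, removed)) for endpoint-only changes.
    """
    if (old.hosts, old.paths, old.certs) != (new.hosts, new.paths, new.certs):
        return ("reload", None)
    added = new.endpoints - old.endpoints
    removed = old.endpoints - new.endpoints
    if not added and not removed:
        return ("noop", None)
    return ("dynamic", (added, removed))


before = Snapshot(
    hosts=frozenset({"example.test"}),
    paths=frozenset({"/"}),
    certs=frozenset(),
    endpoints=frozenset({"10.244.1.5:8080", "10.244.2.7:8080"}),
)
# A rolling update replaces one pod; hosts, paths and certs stay the same.
after = Snapshot(
    hosts=before.hosts,
    paths=before.paths,
    certs=before.certs,
    endpoints=frozenset({"10.244.1.5:8080", "10.244.3.9:8080"}),
)
print(plan_update(before, after))
```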

arikkfir · 2017-09-27T14:46:27Z

Hi @chrismoos,
Would it be possible to support this annotation in the Nginx configmap as well? We have quite a few Ingress resources and would love to set it in one centralized place instead of in each one.

aledbf · 2017-10-08T18:35:46Z

Closing. It is possible to choose between endpoints or services using a flag.

mbugeia · 2017-10-11T08:20:04Z

@aledbf Hi, can you tell me which flag I should use to enable service-upstream by default? I can't find it in the documentation or with -h.

jordanjennings · 2017-10-11T14:47:08Z

@mbugeia It's ingress.kubernetes.io/service-upstream and you can find that on this page: https://github.com/kubernetes/ingress-nginx/blob/master/configuration.md#service-upstream

montanaflynn · 2017-10-26T22:25:22Z

@jordanjennings that link is broken now, here's a working link:

https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/annotations.md#service-upstream

jesseshieh · 2017-11-05T17:57:41Z

@Nalum @aledbf sorry to comment on a closed issue, but I was wondering what the status of using "the kong load balancer code that allows the update of the nginx upstreams without requiring a reload" is. Was there any progress made on that at all?

aledbf · 2017-11-05T18:01:00Z

> Was there any progress made on that at all?

No, sorry. One of the reasons is that adding Lua again means it will work only on some platforms

rikatz · 2017-11-05T18:38:19Z

It might be a stupid question of mine, but are we 'HUP'ing the process? (http://nginx.org/en/docs/control.html)

I've also seen that we could send a 'USR2' signal (https://www.digitalocean.com/community/tutorials/how-to-upgrade-nginx-in-place-without-dropping-client-connections), but this might overload NGINX with stale connections, right?

aledbf · 2017-11-05T18:58:33Z

@rikatz the USR2 procedure creates several new problems:

doubles the resource usage
hard to coordinate
if you have client connections with keepalive, you lose requests
(I already tried this approach two years ago)

mcfedr · 2018-04-02T15:03:42Z

This is likely improved by using #2174

jwatte · 2018-10-17T17:44:08Z

@montanaflynn that link is 404, here's the now-working link: https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/annotations.md#service-upstream
(and that's very likely to rot, too -- google for nginx annotation service-upstream to find it whenever you next need this)

michaelajr · 2019-03-08T15:44:47Z

When I use the annotation I still see all the Pod IPs as endpoints. Shouldn't I see the Service's ClusterIP?

I see this:

  Host  Path  Backends
  ----  ----  --------
  *     *     testapp:443 (192.168.103.2:443,192.168.203.66:443,192.168.5.3:443)

doubleyewdee · 2019-03-18T23:25:49Z

@michaelajr make sure you're using nginx.ingress.kubernetes.io/service-upstream -- the shorter form no longer works for nginx-specific ingress settings by default.
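
For reference, a minimal sketch of where the prefixed annotation goes, written as a Python dict and printed as JSON (which `kubectl apply -f` also accepts). The Ingress name, Service name, and port are assumptions taken loosely from the output above, the `networking.k8s.io/v1` schema is the current one rather than the one in use when this comment was written, and the helper function is hypothetical.

```python
import json


def ingress_with_service_upstream(name, service, port):
    """Build a minimal Ingress manifest that enables service-upstream."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            "annotations": {
                # Annotation values must be strings, hence "true" and not True.
                "nginx.ingress.kubernetes.io/service-upstream": "true",
            },
        },
        "spec": {
            "rules": [{
                "http": {
                    "paths": [{
                        "path": "/",
                        "pathType": "Prefix",
                        "backend": {
                            "service": {"name": service, "port": {"number": port}},
                        },
                    }],
                },
            }],
        },
    }


# Hypothetical names; adjust to your own Service.
manifest = ingress_with_service_upstream("testapp", "testapp", 443)
print(json.dumps(manifest, indent=2))
```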

robbie-demuth · 2019-11-14T16:03:10Z

Does anyone know why the docs indicate that sticky sessions won't work if the service-upstream annotation is used? Wouldn't the session affinity configuration on the Kubernetes Service itself be honored in that case?

EDIT

It looks like this is due to the fact that Kubernetes session affinity is based on client IPs. By default, NGINX ingress obscures client IPs because the Service it creates sets externalTrafficPolicy to Cluster. It looks like setting it to Local instead might resolve the issue, but I'm not quite sure I understand what the implications are - particularly regarding imbalanced traffic spreading

https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip

zhangyuxuan1992 · 2020-06-05T08:21:16Z

> @michaelajr make sure you're using nginx.ingress.kubernetes.io/service-upstream -- the shorter form no longer works for nginx-specific ingress settings by default.

I too use
nginx.ingress.kubernetes.io/service-upstream: 'true'
but still see all pod IPs and ports instead of the ClusterIP.

/*.* *:8423 (192.168.102.25:8423,192.168.149.225:8423)

VengefulAncient · 2020-09-21T15:48:47Z

> I too use
> nginx.ingress.kubernetes.io/service-upstream: 'true'
> but still see all pod IPs and ports instead of the ClusterIP.
>
> /*.* *:8423 (192.168.102.25:8423,192.168.149.225:8423)

Same here and I would really like to understand whether the directive is actually working. Right now it seems to me that it isn't.

cayla · 2022-05-23T20:35:21Z

For anyone finding this page years later, you can confirm the change is working by looking at the ingress-nginx logs for traffic to the ingress in question. By default, the logging will show the backend IP that handled the request. It's either an endpoint IP or service IP depending on the configuration.
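
A small sketch of that check in Python, assuming you have already pulled the upstream address out of a log line yourself (the exact position depends on your log format, so no parsing is attempted here). The function `classify_upstream` and all IPs are hypothetical; the ClusterIP and pod IPs would come from something like `kubectl get svc` and `kubectl get endpoints`. IPv6 addresses are not handled.

```python
def classify_upstream(upstream_addr, cluster_ip, pod_ips):
    """Label each address in an nginx $upstream_addr value.

    $upstream_addr may hold several comma-separated "ip:port" entries
    when a request was retried on another upstream.
    """
    labels = []
    for addr in upstream_addr.split(","):
        ip = addr.strip().rsplit(":", 1)[0]
        if ip == cluster_ip:
            labels.append("service")
        elif ip in pod_ips:
            labels.append("endpoint")
        else:
            labels.append("unknown")
    return labels


# Hypothetical values for illustration only.
cluster_ip = "10.96.12.34"
pod_ips = {"192.168.103.2", "192.168.203.66", "192.168.5.3"}

print(classify_upstream("10.96.12.34:443", cluster_ip, pod_ips))
print(classify_upstream("192.168.103.2:443, 192.168.5.3:443", cluster_ip, pod_ips))
```

If every request is labelled "service", the annotation is in effect for that Ingress; if you only ever see "endpoint", nginx is still talking to the pods directly.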

https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/log-format/

Instead of having NGINX maintain its own list of upstreams, we want it to use the service ClusterIP so that k8s can handle cases such as pod evictions. Without this we end up with 5xx errors, e.g. on deploys. See kubernetes/ingress-nginx#257 for more details.

narqo · 2022-11-22T10:40:21Z

Just so people coming to this issue can take it into account:

We tried rolling out the service-upstream = true setting to the nginx-ingress running in an EKS cluster, and observed a huge imbalance in the traffic handled by each of a deployment's pods. I believe this should have been expected, since kube-proxy in iptables mode load-balances randomly and we spread the deployments' pods across nodes and AZs.

We've decided to roll back to the default service-upstream = false, as shown on the screenshot below (no details on the screenshot, due to internal policies)

longwuyuan · 2022-11-22T13:59:56Z

@narqo thanks for the update

snigdhasambitak · 2022-12-01T12:29:55Z

@narqo did you get it working with service-upstream = true? Could you please specify the version of nginx ingress you are using and your k8s cluster details? It does not seem to work in 1.22

narqo · 2022-12-01T18:52:35Z

> did you get it working with service-upstream = true?

Yes, it worked in our tests (AWS EKS 1.21, ingress-nginx v1.3.1, deployed and configured via Helm). We confirmed that the change (and the rollback of the change) took effect by observing the nginx controller's access logs, where we log the details of each request's upstream.

sherifabdlnaby · 2022-12-26T18:20:25Z

> Just so people coming to this issue can take it into account:
>
> We tried rolling out the service-upstream = true setting to the nginx-ingress running in an EKS cluster, and observed a huge imbalance in the traffic handled by each of a deployment's pods. I believe this should have been expected, since kube-proxy in iptables mode load-balances randomly and we spread the deployments' pods across nodes and AZs.
>
> We've decided to roll back to the default service-upstream = false, as shown on the screenshot below (no details on the screenshot, due to internal policies)

Interesting, would using Topology Aware Hints with Topology Spread Constraints (for the Nginx pods as well as the deployment) help?

taliastocks · 2023-01-20T16:26:05Z

> Just so people coming to this issue can take it into account:
>
> We tried rolling out the service-upstream = true setting to the nginx-ingress running in an EKS cluster, and observed a huge imbalance in the traffic handled by each of a deployment's pods. I believe this should have been expected, since kube-proxy in iptables mode load-balances randomly and we spread the deployments' pods across nodes and AZs.
>
> We've decided to roll back to the default service-upstream = false, as shown on the screenshot below (no details on the screenshot, due to internal policies)

Were you able to have any luck with using preStop lifecycle hooks on your service pods to delay long enough for nginx to pick up the endpoint configuration change? Were you able to adjust any nginx settings, like proxy_next_upstream to avoid failed requests during a service upgrade?

I've been playing around with these myself with no luck (the service backend is gRPC, so we would lose service call load balancing if we enabled service-upstream=True, similar to what you show in your chart).

narqo · 2023-01-20T20:54:49Z

> Were you able to have any luck with using preStop lifecycle hooks on your service pods to delay long enough for nginx to pick up the endpoint configuration change?

This is exactly the issue we were looking at, when we experimented with service-upstream=true. The preStop hook for every deployment (initially raised in #7330) isn't a scalable solution, for dynamic production with hundreds of µ-services behind the ingress. We are looking for better alternatives.

taliastocks · 2023-01-20T21:17:55Z

> > Were you able to have any luck with using preStop lifecycle hooks on your service pods to delay long enough for nginx to pick up the endpoint configuration change?
>
> This is exactly the issue we were looking at, when we experimented with service-upstream=true. The preStop hook for every deployment (initially raised in #7330) isn't a scalable solution, for dynamic production with hundreds of µ-services behind the ingress. We are looking for better alternatives.

Honestly, I would even be happy with like a 5-10 second preStop hook, but even 30 seconds preStop delay + 30 seconds connection draining was not consistently long enough to prevent seeing DEADLINE_EXCEEDED.

The next thing I'm planning on trying is setting nginx.ingress.kubernetes.io/proxy-connect-timeout: "1" and relying on proxy_next_upstream behavior. It's not ideal because it would still add a second of latency to some requests, but that's not a complete deal-breaker for our use-case.
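
The idea behind that plan (short connect timeout, then fall through to the next upstream) can be sketched in plain Python. This is an analogue for illustration, not how nginx implements `proxy_next_upstream`; the function `connect_with_failover` is hypothetical, and the `connect` parameter exists only so the logic can be exercised without a network.

```python
import socket


def connect_with_failover(upstreams, timeout=1.0, connect=socket.create_connection):
    """Try each (host, port) in order; return the first connection that succeeds.

    A pod that is already gone costs at most `timeout` seconds before the
    next upstream is tried, which is the extra latency mentioned above.
    """
    last_error = None
    for address in upstreams:
        try:
            return connect(address, timeout)
        except OSError as exc:  # covers timeouts and refused connections
            last_error = exc
    raise ConnectionError(f"no upstream reachable: {last_error}")
```

A usage example with a fake connector standing in for a terminating pod:

```python
def fake_connect(address, timeout):
    if address[0] == "10.0.0.1":
        raise TimeoutError("pod is terminating")
    return f"connection to {address[0]}"


print(connect_with_failover([("10.0.0.1", 8080), ("10.0.0.2", 8080)], connect=fake_connect))
```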

taliastocks · 2023-01-24T19:33:53Z

> Honestly, I would even be happy with like a 5-10 second preStop hook, but even 30 seconds preStop delay + 30 seconds connection draining was not consistently long enough to prevent seeing DEADLINE_EXCEEDED.
>
> The next thing I'm planning on trying is setting nginx.ingress.kubernetes.io/proxy-connect-timeout: "1" and relying on proxy_next_upstream behavior. It's not ideal because it would still add a second of latency to some requests, but that's not a complete deal-breaker for our use-case.

Welp, turns out the actual problem I was running into was related to new pods coming up slowly rather than old pods terminating. Apologies for misidentifying the issue!

dcodix · 2023-12-02T16:02:17Z

> Just so people coming to this issue can take it into account:
>
> We tried rolling out the service-upstream = true setting to the nginx-ingress running in an EKS cluster, and observed a huge imbalance in the traffic handled by each of a deployment's pods. I believe this should have been expected, since kube-proxy in iptables mode load-balances randomly and we spread the deployments' pods across nodes and AZs.
>
> We've decided to roll back to the default service-upstream = false, as shown on the screenshot below (no details on the screenshot, due to internal policies)

We observed the same behaviour.
This seems to happen because nginx uses long-lived (keepalive) connections: it opens just one connection to the Service IP, which is DNATed to a single pod IP, and then all requests go through that connection up to a maximum number. That number seems to be defined by upstream-keepalive-requests, which I believe defaults to 10k requests, so a new TCP connection is established only after 10k requests. For x requests this cuts the number of TCP connections to about x/10k, so at 1k requests per minute you will not establish a new connection for 10 minutes (unless the connection idles out). kube-proxy (iptables) does a good job of simulating round-robin, but only with a big enough volume of connections: if you open 20k TCP connections via iptables, the result will most likely look balanced, but if you open only 2, chances are it will not.
We were wondering whether it was worth leaving service-upstream: true but lowering upstream-keepalive-requests so that more TCP connections are made. So far we have not, because we have not had time to test the impact of this change on performance and resource usage in general.
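
The effect described above can be reproduced with a toy simulation. This is a rough model under stated assumptions, not a measurement: each new TCP connection is pinned to one uniformly random pod (standing in for the iptables DNAT decision), a connection carries a fixed number of requests before being replaced, and the pod names and request counts are made up.

```python
import random
from collections import Counter


def simulate(total_requests, requests_per_connection, pods, seed=0):
    """Count requests per pod when each new connection picks a random pod."""
    rng = random.Random(seed)
    counts = Counter({pod: 0 for pod in pods})
    sent = 0
    while sent < total_requests:
        pod = rng.choice(pods)  # one random choice per TCP connection
        batch = min(requests_per_connection, total_requests - sent)
        counts[pod] += batch  # every request on this connection hits the same pod
        sent += batch
    return counts


def spread(counts):
    """Difference between the busiest and the least busy pod."""
    return max(counts.values()) - min(counts.values())


pods = ["pod-a", "pod-b", "pod-c"]
# 20k requests over keepalive connections of 10k requests each: 2 connections.
few_connections = simulate(20_000, 10_000, pods)
# 20k requests with one request per connection: 20k connections.
many_connections = simulate(20_000, 1, pods)

print("few connections: ", dict(few_connections), "spread", spread(few_connections))
print("many connections:", dict(many_connections), "spread", spread(many_connections))

# At 1k requests per minute, a 10k-request keepalive limit means a new
# connection roughly every 10 minutes.
print("minutes per connection:", 10_000 / 1_000)
```

With only two connections and three pods, at least one pod necessarily receives no traffic at all, whereas with one connection per request the counts come out close to even. Lowering the keepalive request limit moves the system from the first regime towards the second, at the cost of more TCP connections.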

* flyteadmin http port
* flyteadmin grpc port
* flyteconsole grpc port

This is necessary because the ingress may be configured in a way that it sends TLS traffic to internal Flyte services. Istio will use port names to determine traffic - and may therefore assume the appProtocol of http, even though traffic from ingress -> flyteadmin is actually https. This misconfiguration prevents any traffic from flowing through the ingress to the service.

Flyteadmin http and grpc ports *are* accessible using `http` and `grpc` values for appProtocol respectively within the cluster, but as soon as traffic travels between the ingress and the service those settings will not work. The most "compatible" setting is `tcp` which works for any network stream.

- Adds the nginx.ingress.kubernetes.io/service-upstream: "true" annotation
  Nginx Controller using endpoints instead of Services kubernetes/ingress-nginx#257
  kubernetes/ingress-nginx@main/docs/user-guide/nginx-configuration/annotations.md#service-upstream

Signed-off-by: noahjax <noah.jackson@dominodatalab.com>
Signed-off-by: ddl-ebrown <ethan.brown@dominodatalab.com>

aledbf mentioned this issue Mar 24, 2017

[nginx] Lost client requests when updating a deployment and using keep alive #489

Closed

chrismoos mentioned this issue Jul 16, 2017

Add annotation to allow use of service ClusterIP for NGINX upstream. #981

Merged

aledbf closed this as completed Oct 8, 2017

ghost mentioned this issue Mar 5, 2018

Turning off keepalive does not work as documented #2168

Closed

azman0101 mentioned this issue Nov 2, 2020

Zero downtime upgrade without race conditions #6105

Closed

arianvp mentioned this issue Dec 3, 2020

charts/sftd: introduce wireapp/wire-server-deploy#382

Merged

hazzadous mentioned this issue Aug 17, 2022

fix(nginx): set service-upstream to "true" PostHog/charts-clickhouse#532

Merged

longwuyuan mentioned this issue Sep 4, 2024

service-upstream sending traffic to unhealthy/terminating pods #8973

Closed

ddl-ebrown mentioned this issue Oct 8, 2024

Add appProtocol to agent service to allow agent to work with istio flyteorg/flyte#5240

Merged
