What keywords did you search in NGINX Ingress controller issues before filing this one? (If you have found any duplicates, you should instead reply there.): default backend 404, sync, secrets, ingress, tls
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
Cloud provider or hardware configuration: 9 worker node cluster
OS (e.g. from /etc/os-release): Ubuntu 16.0.3
Kernel (e.g. uname -a):
Install tools: Ansible, Helm
Others:
What happened:
We are randomly getting "default backend 404" responses, depending on which of the 9 nodes handles the request and at what point in time. These occurrences are much reduced if we give a deployment time to "settle in". In the ingress logs of the nodes responding with 404, "adding secret ... to the local store" typically appears much later than on other nodes for the same deployment.
What you expected to happen:
No 404 from any worker node once the application starts responding through at least one node (or within a few seconds of that).
How to reproduce it (as minimally and precisely as possible):
Deploy ingresses with TLS secrets, observe the time the secret is added to the local store for each ingress controller pod.
See below one example of timing where one node is behind by 40min and another by 90min:
ingress-nginx-ingress-controller-4rrlj.log
219612:I0209 03:14:53.185798 7 backend_ssl.go:68] adding secret mynamespace/mytlssecret to the local store
ingress-nginx-ingress-controller-7pl55.log
218480:I0209 03:10:54.994316 7 backend_ssl.go:68] adding secret mynamespace/mytlssecret to the local store
ingress-nginx-ingress-controller-6j5pp.log
227555:I0209 03:51:59.353796 7 backend_ssl.go:68] adding secret mynamespace/mytlssecret to the local store
ingress-nginx-ingress-controller-bbgr2.log
234364:I0209 04:39:41.330655 7 backend_ssl.go:68] adding secret mynamespace/mytlssecret to the local store
ingress-nginx-ingress-controller-q27kj.log
219909:I0209 03:08:15.045453 7 backend_ssl.go:68] adding secret mynamespace/mytlssecret to the local store
ingress-nginx-ingress-controller-clt27.log
231931:I0209 03:05:43.346984 7 backend_ssl.go:68] adding secret mynamespace/mytlssecret to the local store
ingress-nginx-ingress-controller-pcwf9.log
262087:I0209 03:06:12.650154 7 backend_ssl.go:68] adding secret mynamespace/mytlssecret to the local store
ingress-nginx-ingress-controller-vlvkd.log
204711:I0209 03:13:46.820651 7 backend_ssl.go:68] adding secret mynamespace/mytlssecret to the local store
ingress-nginx-ingress-controller-vzg8r.log
217407:I0209 03:01:32.340037 7 backend_ssl.go:68] adding secret mynamespace/mytlssecret to the local store
Anything else we need to know:
@aledbf thank you very much for your quick response. I saw 2 test runs complete without any 404, and on each node "adding secret ... to the local store" was logged within 5s of the others. More testing to come, but this looks good. Thanks again.
Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.): potential bug
NGINX Ingress controller version: v0.10.2
Kubernetes version (use kubectl version):
Environment:
Kernel (e.g. uname -a):
What happened:
We are randomly getting "default backend 404" responses, depending on which of the 9 nodes handles the request and at what point in time. These occurrences are much reduced if we give a deployment time to "settle in". In the ingress logs of the nodes responding with 404, "adding secret ... to the local store" typically appears much later than on other nodes for the same deployment.
While trying to troubleshoot this, we came across this line, which seems suspicious:
https://github.com/kubernetes/ingress-nginx/blob/nginx-0.10.2/internal/ingress/controller/store/backend_ssl.go#L199
This used to be a "continue" instead of a "return", and it seems odd that this should give up on all remaining ingresses.
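To illustrate the concern, here is a minimal sketch of how "return" and "continue" differ inside a loop of this shape. It is not the code from backend_ssl.go: `syncSecrets`, the `available` map and the secret names other than mynamespace/mytlssecret are made up for the example, and the real condition on that line may be different.

```go
package main

import "fmt"

// syncSecrets is a hypothetical stand-in for a loop that syncs one TLS secret
// per ingress. With stopOnMissing=true it behaves like a "return" on the first
// secret that cannot be synced; with false it behaves like a "continue".
func syncSecrets(secrets []string, available map[string]bool, stopOnMissing bool) []string {
	var synced []string
	for _, name := range secrets {
		if !available[name] {
			if stopOnMissing {
				return synced // gives up on every remaining secret
			}
			continue // skips only this secret
		}
		synced = append(synced, name)
	}
	return synced
}

func main() {
	secrets := []string{"ns-a/tls", "ns-b/tls", "mynamespace/mytlssecret"}
	// ns-b/tls is not available, e.g. it has not been created yet.
	available := map[string]bool{"ns-a/tls": true, "mynamespace/mytlssecret": true}

	fmt.Println(syncSecrets(secrets, available, true))  // [ns-a/tls]
	fmt.Println(syncSecrets(secrets, available, false)) // [ns-a/tls mynamespace/mytlssecret]
}
```

With the "return" variant, one unavailable secret earlier in the list is enough to keep mynamespace/mytlssecret out of the result until a later pass, which would match the delays we see if the real loop behaves this way.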
What you expected to happen:
No 404 from any worker node once the application starts responding through at least one node (or within a few seconds of that).