
Issue installing Ambassador over Kubernetes #240

Closed
ashish1993 opened this issue Jan 31, 2018 · 12 comments

Comments

@ashish1993

ashish1993 commented Jan 31, 2018

Hi,
I am trying to install Ambassador on Kubernetes.

The Ambassador pods are stuck in CrashLoopBackOff.

logs: http://paste.openstack.org/show/658099/
ambassador container logs: http://paste.openstack.org/show/658122/
Steps followed for installation (per https://www.getambassador.io/user-guide/getting-started):

  1. kubectl apply -f ambassador-service.yaml

     ambassador-service.yaml: http://paste.openstack.org/show/658120/

  2. kubectl apply -f https://getambassador.io/yaml/ambassador/ambassador-rbac.yaml

ERROR says:

Readiness probe failed: Get http://10.241.96.13:8877/ambassador/v0/check_ready: dial tcp 10.241.96.13:8877: getsockopt: connection refused
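
For anyone triaging the same symptom, a minimal inspection sketch (cluster-dependent; the `service=ambassador` label selector and pod name below are assumptions based on the stock manifests, so substitute your own):

```
# List the Ambassador pods and their restart counts
kubectl get pods -l service=ambassador

# Show the events (including probe failures) for a failing pod
kubectl describe pod <ambassador-pod-name>

# Logs from the current and the previously crashed container
kubectl logs <ambassador-pod-name>
kubectl logs --previous <ambassador-pod-name>
```

`kubectl logs --previous` is often the most useful of these for CrashLoopBackOff, since it shows why the last container instance exited.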

@aroundthecode

aroundthecode commented Feb 2, 2018

Same here: Ambassador 0.23 on Kubernetes 1.9.2.
The pods started healthy (I was able to reach the Ambassador admin page), then began crashing on random health-check failures.

Edit: removing the health checks from the Deployment manifest (not the best approach, I know! :) ) stopped the random killing, and the pods seem stable in any case.

@richarddli
Contributor

OK, some progress on tracking this down. I'm on Minikube, running Kubernetes 1.9.0.

I bumped the liveness/readiness probe initialDelaySeconds and periodSeconds to 15 seconds (from 3). Then I see this in the log:

[2018-02-05 16:31:44.060][24][info][config] source/server/configuration_impl.cc:110] loading stats sink configuration
2018-02-05 16:31:44 kubewatch 0.22.0 INFO: Configuration /etc/ambassador-config-2-envoy.json valid
2018-02-05 16:31:44 kubewatch 0.22.0 INFO: Moved valid configuration /etc/ambassador-config-2-envoy.json to /etc/envoy-2.json
unable to initialize hot restart: previous envoy process is still initializing
starting hot-restarter with target: /application/start-envoy.sh
forking and execing new child process at epoch 0
forked new child process with PID=12
got SIGHUP
forking and execing new child process at epoch 1
forked new child process with PID=25
got SIGCHLD
PID=25 exited with code=1
Due to abnormal exit, force killing all child processes and exiting
force killing PID=12
exiting due to lack of child processes
AMBASSADOR: envoy exited with status 1

Here's the envoy.json we were trying to run with:

When I drop it back down to 3 seconds, I get the connection refused readiness probe errors.
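
For reference, the probe tuning described above corresponds to something like the following in the Ambassador Deployment manifest (a sketch, not the exact upstream YAML; the port and the `check_ready` path are taken from the probe errors in this thread, and `check_alive` is the corresponding liveness endpoint):

```
livenessProbe:
  httpGet:
    path: /ambassador/v0/check_alive
    port: 8877
  initialDelaySeconds: 15   # up from 3
  periodSeconds: 15         # up from 3
readinessProbe:
  httpGet:
    path: /ambassador/v0/check_ready
    port: 8877
  initialDelaySeconds: 15
  periodSeconds: 15
```

The larger initialDelaySeconds gives Envoy time to finish its hot-restart initialization before the kubelet starts probing.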

@richarddli
Contributor

richarddli commented Feb 5, 2018

In my case, it looks like things are failing because Kube DNS is crashing because of kubernetes/minikube#1722.

@aroundthecode @ashish1993 Could you verify that Kube DNS is working properly on your systems? You can check with kubectl get pods -n kube-system. If so, can you paste the logs from your Kube DNS pod (kubectl logs -n kube-system <kube-dns-pod-name> kubedns)?
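
A quick way to confirm that cluster DNS resolves from inside a pod (the standard approach from the Kubernetes DNS debugging docs; assumes the busybox image can be pulled in your cluster):

```
# Run a throwaway pod and resolve the kubernetes service name
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 -- \
  nslookup kubernetes.default
```

A healthy cluster should answer with the ClusterIP of the kubernetes service; a timeout or SERVFAIL here points at kube-dns rather than Ambassador.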

@richarddli
Contributor

My error log:

I0205 16:52:36.008783       1 dns.go:174] Waiting for services and endpoints to be initialized from apiserver...
I0205 16:52:36.507882       1 dns.go:174] Waiting for services and endpoints to be initialized from apiserver...
E0205 16:52:36.676135       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.ConfigMap: configmaps is forbidden: User "system:serviceaccount:kube-system:default" cannot list configmaps in the namespace "kube-system"
E0205 16:52:36.676543       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: endpoints is forbidden: User "system:serviceaccount:kube-system:default" cannot list endpoints at the cluster scope
E0205 16:52:36.678036       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: services is forbidden: User "system:serviceaccount:kube-system:default" cannot list services at the cluster scope
I0205 16:52:37.007929       1 dns.go:174] Waiting for services and endpoints to be initialized from apiserver...
F0205 16:52:37.507812       1 dns.go:168] Timeout waiting for initialization
  FirstSeen	LastSeen	Count	From			SubObjectPath			Type		Reason		Message
  ---------	--------	-----	----			-------------			--------	------		-------
  2d		24m		386	kubelet, minikube	spec.containers{dnsmasq}	Warning		Unhealthy	Liveness probe failed: HTTP probe failed with statuscode: 503
  2d		19m		1514	kubelet, minikube	spec.containers{kubedns}	Warning		BackOff		Back-off restarting failed container
  2d		14m		518	kubelet, minikube	spec.containers{kubedns}	Warning		Unhealthy	Readiness probe failed: Get http://172.17.0.2:8081/readiness: dial tcp 172.17.0.2:8081: getsockopt: connection refused
  2d		4m		1131	kubelet, minikube	spec.containers{dnsmasq}	Warning		BackOff		Back-off restarting failed container

@mbovo

mbovo commented Feb 7, 2018

Hi @richarddli, I'm working in tandem with @aroundthecode; here are the logs you requested.
We have a Kubernetes HA cluster with 3 masters and 4 minions. etcd is deployed on each master, and the apiserver is served through an external load balancer.

NAME                                                       READY     STATUS    RESTARTS   AGE
kube-apiserver-itpv-ddkub-i233.facilitylive.int            1/1       Running   0          5d
kube-apiserver-itpv-ddkub-i234.facilitylive.int            1/1       Running   0          5d
kube-apiserver-itpv-ddkub-i235.facilitylive.int            1/1       Running   0          5d
kube-controller-manager-itpv-ddkub-i233.facilitylive.int   1/1       Running   1          13d
kube-controller-manager-itpv-ddkub-i234.facilitylive.int   1/1       Running   0          5d
kube-controller-manager-itpv-ddkub-i235.facilitylive.int   1/1       Running   0          5d
kube-dns-6f4fd4bdf-2jhsq                                   3/3       Running   0          21h
kube-dns-6f4fd4bdf-nfp2k                                   3/3       Running   0          21h
kube-dns-6f4fd4bdf-v9b2v                                   3/3       Running   0          21h
kube-flannel-ds-2f5db                                      1/1       Running   0          5d
kube-flannel-ds-64kpw                                      1/1       Running   0          5d
kube-flannel-ds-67g27                                      1/1       Running   0          5d
kube-flannel-ds-c6x6v                                      1/1       Running   0          5d
kube-flannel-ds-fgctt                                      1/1       Running   0          5d
kube-flannel-ds-g68pj                                      1/1       Running   0          5d
kube-flannel-ds-z5bjx                                      1/1       Running   0          5d
kube-proxy-594ch                                           1/1       Running   0          5d
kube-proxy-rkfkq                                           1/1       Running   0          5d
kube-proxy-snh25                                           1/1       Running   0          13d
kube-proxy-ww2vn                                           1/1       Running   0          5d
kube-proxy-xkkpv                                           1/1       Running   0          5d
kube-proxy-z42bs                                           1/1       Running   0          5d
kube-proxy-zmbdz                                           1/1       Running   0          5d
kube-scheduler-itpv-ddkub-i233.facilitylive.int            1/1       Running   1          13d
kube-scheduler-itpv-ddkub-i234.facilitylive.int            1/1       Running   0          5d
kube-scheduler-itpv-ddkub-i235.facilitylive.int            1/1       Running   0          5d
kubernetes-dashboard-845747bdd4-242gf                      1/1       Running   0          5d
$ kubectl logs -n kube-system kube-dns-6f4fd4bdf-2jhsq kubedns
I0206 16:29:04.237731       1 dns.go:48] version: 1.14.6-3-gc36cb11
I0206 16:29:04.308779       1 server.go:69] Using configuration read from directory: /kube-dns-config with period 10s
I0206 16:29:04.308932       1 server.go:112] FLAG: --alsologtostderr="false"
I0206 16:29:04.308971       1 server.go:112] FLAG: --config-dir="/kube-dns-config"
I0206 16:29:04.308996       1 server.go:112] FLAG: --config-map=""
I0206 16:29:04.309009       1 server.go:112] FLAG: --config-map-namespace="kube-system"
I0206 16:29:04.309028       1 server.go:112] FLAG: --config-period="10s"
I0206 16:29:04.309051       1 server.go:112] FLAG: --dns-bind-address="0.0.0.0"
I0206 16:29:04.309068       1 server.go:112] FLAG: --dns-port="10053"
I0206 16:29:04.309097       1 server.go:112] FLAG: --domain="cluster.local."
I0206 16:29:04.309119       1 server.go:112] FLAG: --federations=""
I0206 16:29:04.309141       1 server.go:112] FLAG: --healthz-port="8081"
I0206 16:29:04.309156       1 server.go:112] FLAG: --initial-sync-timeout="1m0s"
I0206 16:29:04.309167       1 server.go:112] FLAG: --kube-master-url=""
I0206 16:29:04.309178       1 server.go:112] FLAG: --kubecfg-file=""
I0206 16:29:04.309200       1 server.go:112] FLAG: --log-backtrace-at=":0"
I0206 16:29:04.309244       1 server.go:112] FLAG: --log-dir=""
I0206 16:29:04.309258       1 server.go:112] FLAG: --log-flush-frequency="5s"
I0206 16:29:04.309272       1 server.go:112] FLAG: --logtostderr="true"
I0206 16:29:04.309305       1 server.go:112] FLAG: --nameservers=""
I0206 16:29:04.309363       1 server.go:112] FLAG: --stderrthreshold="2"
I0206 16:29:04.309388       1 server.go:112] FLAG: --v="2"
I0206 16:29:04.309399       1 server.go:112] FLAG: --version="false"
I0206 16:29:04.309417       1 server.go:112] FLAG: --vmodule=""
I0206 16:29:04.309791       1 server.go:194] Starting SkyDNS server (0.0.0.0:10053)
I0206 16:29:04.310501       1 server.go:213] Skydns metrics enabled (/metrics:10055)
I0206 16:29:04.310541       1 dns.go:146] Starting endpointsController
I0206 16:29:04.310553       1 dns.go:149] Starting serviceController
I0206 16:29:04.310822       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0206 16:29:04.310846       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0206 16:29:04.810965       1 dns.go:170] Initialized services and endpoints from apiserver
I0206 16:29:04.811022       1 server.go:128] Setting up Healthz Handler (/readiness)
I0206 16:29:04.811062       1 server.go:133] Setting up cache handler (/cache)
I0206 16:29:04.811081       1 server.go:119] Status HTTP port 8081
I0206 17:08:19.252268       1 dns.go:555] Could not find endpoints for service "prometheus-operated" in namespace "default". DNS records will be created once endpoints show up.
I0207 13:27:13.102411       1 dns.go:555] Could not find endpoints for service "prometheus-operated" in namespace "default". DNS records will be created once endpoints show up.
$ kubectl logs -n kube-system kube-dns-6f4fd4bdf-nfp2k kubedns
I0206 16:29:04.381159       1 dns.go:48] version: 1.14.6-3-gc36cb11
I0206 16:29:04.385295       1 server.go:69] Using configuration read from directory: /kube-dns-config with period 10s
I0206 16:29:04.385457       1 server.go:112] FLAG: --alsologtostderr="false"
I0206 16:29:04.385497       1 server.go:112] FLAG: --config-dir="/kube-dns-config"
I0206 16:29:04.385553       1 server.go:112] FLAG: --config-map=""
I0206 16:29:04.385579       1 server.go:112] FLAG: --config-map-namespace="kube-system"
I0206 16:29:04.385676       1 server.go:112] FLAG: --config-period="10s"
I0206 16:29:04.385741       1 server.go:112] FLAG: --dns-bind-address="0.0.0.0"
I0206 16:29:04.385783       1 server.go:112] FLAG: --dns-port="10053"
I0206 16:29:04.385818       1 server.go:112] FLAG: --domain="cluster.local."
I0206 16:29:04.385841       1 server.go:112] FLAG: --federations=""
I0206 16:29:04.385873       1 server.go:112] FLAG: --healthz-port="8081"
I0206 16:29:04.385883       1 server.go:112] FLAG: --initial-sync-timeout="1m0s"
I0206 16:29:04.385898       1 server.go:112] FLAG: --kube-master-url=""
I0206 16:29:04.385910       1 server.go:112] FLAG: --kubecfg-file=""
I0206 16:29:04.385919       1 server.go:112] FLAG: --log-backtrace-at=":0"
I0206 16:29:04.385941       1 server.go:112] FLAG: --log-dir=""
I0206 16:29:04.385952       1 server.go:112] FLAG: --log-flush-frequency="5s"
I0206 16:29:04.385967       1 server.go:112] FLAG: --logtostderr="true"
I0206 16:29:04.385977       1 server.go:112] FLAG: --nameservers=""
I0206 16:29:04.385991       1 server.go:112] FLAG: --stderrthreshold="2"
I0206 16:29:04.386000       1 server.go:112] FLAG: --v="2"
I0206 16:29:04.386014       1 server.go:112] FLAG: --version="false"
I0206 16:29:04.386030       1 server.go:112] FLAG: --vmodule=""
I0206 16:29:04.386288       1 server.go:194] Starting SkyDNS server (0.0.0.0:10053)
I0206 16:29:04.386995       1 server.go:213] Skydns metrics enabled (/metrics:10055)
I0206 16:29:04.387055       1 dns.go:146] Starting endpointsController
I0206 16:29:04.387068       1 dns.go:149] Starting serviceController
I0206 16:29:04.387334       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0206 16:29:04.387391       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0206 16:29:04.887521       1 dns.go:170] Initialized services and endpoints from apiserver
I0206 16:29:04.887633       1 server.go:128] Setting up Healthz Handler (/readiness)
I0206 16:29:04.887663       1 server.go:133] Setting up cache handler (/cache)
I0206 16:29:04.887677       1 server.go:119] Status HTTP port 8081
I0206 17:08:19.253949       1 dns.go:555] Could not find endpoints for service "prometheus-operated" in namespace "default". DNS records will be created once endpoints show up.
I0207 13:27:27.817457       1 dns.go:555] Could not find endpoints for service "prometheus-operated" in namespace "default". DNS records will be created once endpoints show up.
$ kubectl logs -n kube-system kube-dns-6f4fd4bdf-v9b2v kubedns
I0206 16:28:59.366665       1 dns.go:48] version: 1.14.6-3-gc36cb11
I0206 16:28:59.486752       1 server.go:69] Using configuration read from directory: /kube-dns-config with period 10s
I0206 16:28:59.486951       1 server.go:112] FLAG: --alsologtostderr="false"
I0206 16:28:59.487000       1 server.go:112] FLAG: --config-dir="/kube-dns-config"
I0206 16:28:59.487019       1 server.go:112] FLAG: --config-map=""
I0206 16:28:59.487028       1 server.go:112] FLAG: --config-map-namespace="kube-system"
I0206 16:28:59.487041       1 server.go:112] FLAG: --config-period="10s"
I0206 16:28:59.487065       1 server.go:112] FLAG: --dns-bind-address="0.0.0.0"
I0206 16:28:59.487083       1 server.go:112] FLAG: --dns-port="10053"
I0206 16:28:59.487110       1 server.go:112] FLAG: --domain="cluster.local."
I0206 16:28:59.487133       1 server.go:112] FLAG: --federations=""
I0206 16:28:59.487154       1 server.go:112] FLAG: --healthz-port="8081"
I0206 16:28:59.487172       1 server.go:112] FLAG: --initial-sync-timeout="1m0s"
I0206 16:28:59.487189       1 server.go:112] FLAG: --kube-master-url=""
I0206 16:28:59.487209       1 server.go:112] FLAG: --kubecfg-file=""
I0206 16:28:59.487223       1 server.go:112] FLAG: --log-backtrace-at=":0"
I0206 16:28:59.487242       1 server.go:112] FLAG: --log-dir=""
I0206 16:28:59.487252       1 server.go:112] FLAG: --log-flush-frequency="5s"
I0206 16:28:59.487261       1 server.go:112] FLAG: --logtostderr="true"
I0206 16:28:59.487270       1 server.go:112] FLAG: --nameservers=""
I0206 16:28:59.487278       1 server.go:112] FLAG: --stderrthreshold="2"
I0206 16:28:59.487288       1 server.go:112] FLAG: --v="2"
I0206 16:28:59.487341       1 server.go:112] FLAG: --version="false"
I0206 16:28:59.487364       1 server.go:112] FLAG: --vmodule=""
I0206 16:28:59.487573       1 server.go:194] Starting SkyDNS server (0.0.0.0:10053)
I0206 16:28:59.488385       1 server.go:213] Skydns metrics enabled (/metrics:10055)
I0206 16:28:59.488421       1 dns.go:146] Starting endpointsController
I0206 16:28:59.488431       1 dns.go:149] Starting serviceController
I0206 16:28:59.488742       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0206 16:28:59.488764       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0206 16:28:59.988728       1 dns.go:170] Initialized services and endpoints from apiserver
I0206 16:28:59.988785       1 server.go:128] Setting up Healthz Handler (/readiness)
I0206 16:28:59.988815       1 server.go:133] Setting up cache handler (/cache)
I0206 16:28:59.988834       1 server.go:119] Status HTTP port 8081
I0206 17:08:19.250538       1 dns.go:555] Could not find endpoints for service "prometheus-operated" in namespace "default". DNS records will be created once endpoints show up.
I0207 13:28:50.005463       1 dns.go:555] Could not find endpoints for service "prometheus-operated" in namespace "default". DNS records will be created once endpoints show up.

It seems kube-dns is working, but the ambassador pods are still failing with: Readiness probe failed: Get http://10.244.5.35:8877/ambassador/v0/check_ready: dial tcp 10.244.5.35:8877: getsockopt: connection refused

The ambassador pod logs show nothing useful.
Removing the probes does, of course, make the pods work.

@mbovo

mbovo commented Feb 7, 2018

@richarddli I can confirm that setting initialDelaySeconds to 15 mitigates the issue.

@richarddli
Contributor

I've run into this a few times, and each time it seems that Kube DNS gets messed up. kubernetes/kubernetes#45976 may be related.

@richarddli
Contributor

@ashish1993 any update on your issue?

@mbovo is ambassador working for you?

We found an issue internally with some of our Kubernetes clusters; see kubernetes/kubeadm#273 and kubernetes/kubernetes#45828. The fix described in those issues resolves our problem.

(I'm going to close this issue in a week or so unless there is more data)

@chiraggupta06

Hi,
We have also tried installing it on Kubernetes, but the Ambassador pods crash after some time with the same error as above.

@plombardi89
Contributor

Thanks for the report, @chiraggupta06. Can you give us a little more information?

  • Kubernetes version?
  • Kubernetes Provisioning tool (Minikube, Kops, Kubeadm etc. ?)
  • Ambassador version?

@chiraggupta06

Kubernetes version: 1.9
Provisioning tool: Kargo (Kubespray)
Ambassador version: 0.29.0

@kflynn
Member

kflynn commented May 29, 2018

We're going to collect these under #437.

@kflynn kflynn closed this as completed May 29, 2018