
Issue installing Ambassador over Kubernetes #240

Closed
ashish1993 opened this issue Jan 31, 2018 · 12 comments

Comments

@ashish1993

ashish1993 commented Jan 31, 2018

Hi,
I am trying to install Ambassador on Kubernetes.

The Ambassador pods are stuck in CrashLoopBackOff.

logs: http://paste.openstack.org/show/658099/
ambassador container logs: http://paste.openstack.org/show/658122/
Steps followed for installation (per https://www.getambassador.io/user-guide/getting-started):

  1. kubectl apply -f ambassador-service.yaml

     ambassador-service.yaml: http://paste.openstack.org/show/658120/

  2. kubectl apply -f https://getambassador.io/yaml/ambassador/ambassador-rbac.yaml

ERROR says:

Readiness probe failed: Get http://10.241.96.13:8877/ambassador/v0/check_ready: dial tcp 10.241.96.13:8877: getsockopt: connection refused
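
For anyone triaging the same symptom, a minimal inspection sketch (cluster-dependent; the `service=ambassador` label selector and pod name below are assumptions based on the stock manifests, so substitute your own):

```
# List the Ambassador pods and their restart counts
kubectl get pods -l service=ambassador

# Show the events (including probe failures) for a failing pod
kubectl describe pod <ambassador-pod-name>

# Logs from the current and the previously crashed container
kubectl logs <ambassador-pod-name>
kubectl logs --previous <ambassador-pod-name>
```

`kubectl logs --previous` is often the most useful of these for CrashLoopBackOff, since it shows why the last container instance exited.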

@aroundthecode

aroundthecode commented Feb 2, 2018

Same here: Ambassador 0.23 on Kubernetes 1.9.2.
The pods started healthy (I was able to reach the Ambassador admin page), then began crashing on random health-check failures.

Edit: removing the health checks from the Deployment manifest (not the best approach, I know! :) ) stopped the random killing, and the pods seem stable in any case.

@richarddli
Contributor

OK, some progress on tracking this down. I'm on Minikube, running Kubernetes 1.9.0.

I bumped the liveness/readiness probe initialDelaySeconds and periodSeconds to 15 seconds (from 3). Then I see this in the log:

[2018-02-05 16:31:44.060][24][info][config] source/server/configuration_impl.cc:110] loading stats sink configuration
2018-02-05 16:31:44 kubewatch 0.22.0 INFO: Configuration /etc/ambassador-config-2-envoy.json valid
2018-02-05 16:31:44 kubewatch 0.22.0 INFO: Moved valid configuration /etc/ambassador-config-2-envoy.json to /etc/envoy-2.json
unable to initialize hot restart: previous envoy process is still initializing
starting hot-restarter with target: /application/start-envoy.sh
forking and execing new child process at epoch 0
forked new child process with PID=12
got SIGHUP
forking and execing new child process at epoch 1
forked new child process with PID=25
got SIGCHLD
PID=25 exited with code=1
Due to abnormal exit, force killing all child processes and exiting
force killing PID=12
exiting due to lack of child processes
AMBASSADOR: envoy exited with status 1

Here's the envoy.json we were trying to run with:

When I drop it back down to 3 seconds, I get the connection refused readiness probe errors.
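
For reference, the probe tuning described above corresponds to something like the following in the Ambassador Deployment manifest (a sketch, not the exact upstream YAML; the port and the `check_ready` path are taken from the probe errors in this thread, and `check_alive` is the corresponding liveness endpoint):

```
livenessProbe:
  httpGet:
    path: /ambassador/v0/check_alive
    port: 8877
  initialDelaySeconds: 15   # up from 3
  periodSeconds: 15         # up from 3
readinessProbe:
  httpGet:
    path: /ambassador/v0/check_ready
    port: 8877
  initialDelaySeconds: 15
  periodSeconds: 15
```

The larger initialDelaySeconds gives Envoy time to finish its hot-restart initialization before the kubelet starts probing.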

@richarddli
Contributor

richarddli commented Feb 5, 2018

In my case, it looks like things are failing because Kube DNS is crashing because of kubernetes/minikube#1722.

@aroundthecode @ashish1993 Could you verify that Kube DNS is working properly on your systems? You can check with kubectl get pods -n kube-system. If so, can you paste the logs from your Kube DNS pod (kubectl logs -n kube-system <kube-dns-pod-name> kubedns)?
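
A quick way to confirm that cluster DNS resolves from inside a pod (the standard approach from the Kubernetes DNS debugging docs; assumes the busybox image can be pulled in your cluster):

```
# Run a throwaway pod and resolve the kubernetes service name
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 -- \
  nslookup kubernetes.default
```

A healthy cluster should answer with the ClusterIP of the kubernetes service; a timeout or SERVFAIL here points at kube-dns rather than Ambassador.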

@richarddli
Contributor

My error log:

I0205 16:52:36.008783       1 dns.go:174] Waiting for services and endpoints to be initialized from apiserver...
I0205 16:52:36.507882       1 dns.go:174] Waiting for services and endpoints to be initialized from apiserver...
E0205 16:52:36.676135       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.ConfigMap: configmaps is forbidden: User "system:serviceaccount:kube-system:default" cannot list configmaps in the namespace "kube-system"
E0205 16:52:36.676543       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: endpoints is forbidden: User "system:serviceaccount:kube-system:default" cannot list endpoints at the cluster scope
E0205 16:52:36.678036       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: services is forbidden: User "system:serviceaccount:kube-system:default" cannot list services at the cluster scope
I0205 16:52:37.007929       1 dns.go:174] Waiting for services and endpoints to be initialized from apiserver...
F0205 16:52:37.507812       1 dns.go:168] Timeout waiting for initialization
  FirstSeen	LastSeen	Count	From			SubObjectPath			Type		Reason		Message
  ---------	--------	-----	----			-------------			--------	------		-------
  2d		24m		386	kubelet, minikube	spec.containers{dnsmasq}	Warning		Unhealthy	Liveness probe failed: HTTP probe failed with statuscode: 503
  2d		19m		1514	kubelet, minikube	spec.containers{kubedns}	Warning		BackOff		Back-off restarting failed container
  2d		14m		518	kubelet, minikube	spec.containers{kubedns}	Warning		Unhealthy	Readiness probe failed: Get http://172.17.0.2:8081/readiness: dial tcp 172.17.0.2:8081: getsockopt: connection refused
  2d		4m		1131	kubelet, minikube	spec.containers{dnsmasq}	Warning		BackOff		Back-off restarting failed container

@mbovo

mbovo commented Feb 7, 2018

Hi @richarddli, I'm working in tandem with @aroundthecode; here are the logs you requested.
We have a Kubernetes HA cluster with 3 masters and 4 minions. etcd is deployed on each master, and the apiserver is served through an external load balancer.

NAME                                                       READY     STATUS    RESTARTS   AGE
kube-apiserver-itpv-ddkub-i233.facilitylive.int            1/1       Running   0          5d
kube-apiserver-itpv-ddkub-i234.facilitylive.int            1/1       Running   0          5d
kube-apiserver-itpv-ddkub-i235.facilitylive.int            1/1       Running   0          5d
kube-controller-manager-itpv-ddkub-i233.facilitylive.int   1/1       Running   1          13d
kube-controller-manager-itpv-ddkub-i234.facilitylive.int   1/1       Running   0          5d
kube-controller-manager-itpv-ddkub-i235.facilitylive.int   1/1       Running   0          5d
kube-dns-6f4fd4bdf-2jhsq                                   3/3       Running   0          21h
kube-dns-6f4fd4bdf-nfp2k                                   3/3       Running   0          21h
kube-dns-6f4fd4bdf-v9b2v                                   3/3       Running   0          21h
kube-flannel-ds-2f5db                                      1/1       Running   0          5d
kube-flannel-ds-64kpw                                      1/1       Running   0          5d
kube-flannel-ds-67g27                                      1/1       Running   0          5d
kube-flannel-ds-c6x6v                                      1/1       Running   0          5d
kube-flannel-ds-fgctt                                      1/1       Running   0          5d
kube-flannel-ds-g68pj                                      1/1       Running   0          5d
kube-flannel-ds-z5bjx                                      1/1       Running   0          5d
kube-proxy-594ch                                           1/1       Running   0          5d
kube-proxy-rkfkq                                           1/1       Running   0          5d
kube-proxy-snh25                                           1/1       Running   0          13d
kube-proxy-ww2vn                                           1/1       Running   0          5d
kube-proxy-xkkpv                                           1/1       Running   0          5d
kube-proxy-z42bs                                           1/1       Running   0          5d
kube-proxy-zmbdz                                           1/1       Running   0          5d
kube-scheduler-itpv-ddkub-i233.facilitylive.int            1/1       Running   1          13d
kube-scheduler-itpv-ddkub-i234.facilitylive.int            1/1       Running   0          5d
kube-scheduler-itpv-ddkub-i235.facilitylive.int            1/1       Running   0          5d
kubernetes-dashboard-845747bdd4-242gf                      1/1       Running   0          5d
$ kubectl logs -n kube-system kube-dns-6f4fd4bdf-2jhsq kubedns
I0206 16:29:04.237731       1 dns.go:48] version: 1.14.6-3-gc36cb11
I0206 16:29:04.308779       1 server.go:69] Using configuration read from directory: /kube-dns-config with period 10s
I0206 16:29:04.308932       1 server.go:112] FLAG: --alsologtostderr="false"
I0206 16:29:04.308971       1 server.go:112] FLAG: --config-dir="/kube-dns-config"
I0206 16:29:04.308996       1 server.go:112] FLAG: --config-map=""
I0206 16:29:04.309009       1 server.go:112] FLAG: --config-map-namespace="kube-system"
I0206 16:29:04.309028       1 server.go:112] FLAG: --config-period="10s"
I0206 16:29:04.309051       1 server.go:112] FLAG: --dns-bind-address="0.0.0.0"
I0206 16:29:04.309068       1 server.go:112] FLAG: --dns-port="10053"
I0206 16:29:04.309097       1 server.go:112] FLAG: --domain="cluster.local."
I0206 16:29:04.309119       1 server.go:112] FLAG: --federations=""
I0206 16:29:04.309141       1 server.go:112] FLAG: --healthz-port="8081"
I0206 16:29:04.309156       1 server.go:112] FLAG: --initial-sync-timeout="1m0s"
I0206 16:29:04.309167       1 server.go:112] FLAG: --kube-master-url=""
I0206 16:29:04.309178       1 server.go:112] FLAG: --kubecfg-file=""
I0206 16:29:04.309200       1 server.go:112] FLAG: --log-backtrace-at=":0"
I0206 16:29:04.309244       1 server.go:112] FLAG: --log-dir=""
I0206 16:29:04.309258       1 server.go:112] FLAG: --log-flush-frequency="5s"
I0206 16:29:04.309272       1 server.go:112] FLAG: --logtostderr="true"
I0206 16:29:04.309305       1 server.go:112] FLAG: --nameservers=""
I0206 16:29:04.309363       1 server.go:112] FLAG: --stderrthreshold="2"
I0206 16:29:04.309388       1 server.go:112] FLAG: --v="2"
I0206 16:29:04.309399       1 server.go:112] FLAG: --version="false"
I0206 16:29:04.309417       1 server.go:112] FLAG: --vmodule=""
I0206 16:29:04.309791       1 server.go:194] Starting SkyDNS server (0.0.0.0:10053)
I0206 16:29:04.310501       1 server.go:213] Skydns metrics enabled (/metrics:10055)
I0206 16:29:04.310541       1 dns.go:146] Starting endpointsController
I0206 16:29:04.310553       1 dns.go:149] Starting serviceController
I0206 16:29:04.310822       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0206 16:29:04.310846       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0206 16:29:04.810965       1 dns.go:170] Initialized services and endpoints from apiserver
I0206 16:29:04.811022       1 server.go:128] Setting up Healthz Handler (/readiness)
I0206 16:29:04.811062       1 server.go:133] Setting up cache handler (/cache)
I0206 16:29:04.811081       1 server.go:119] Status HTTP port 8081
I0206 17:08:19.252268       1 dns.go:555] Could not find endpoints for service "prometheus-operated" in namespace "default". DNS records will be created once endpoints show up.
I0207 13:27:13.102411       1 dns.go:555] Could not find endpoints for service "prometheus-operated" in namespace "default". DNS records will be created once endpoints show up.
$ kubectl logs -n kube-system kube-dns-6f4fd4bdf-nfp2k kubedns
I0206 16:29:04.381159       1 dns.go:48] version: 1.14.6-3-gc36cb11
I0206 16:29:04.385295       1 server.go:69] Using configuration read from directory: /kube-dns-config with period 10s
I0206 16:29:04.385457       1 server.go:112] FLAG: --alsologtostderr="false"
I0206 16:29:04.385497       1 server.go:112] FLAG: --config-dir="/kube-dns-config"
I0206 16:29:04.385553       1 server.go:112] FLAG: --config-map=""
I0206 16:29:04.385579       1 server.go:112] FLAG: --config-map-namespace="kube-system"
I0206 16:29:04.385676       1 server.go:112] FLAG: --config-period="10s"
I0206 16:29:04.385741       1 server.go:112] FLAG: --dns-bind-address="0.0.0.0"
I0206 16:29:04.385783       1 server.go:112] FLAG: --dns-port="10053"
I0206 16:29:04.385818       1 server.go:112] FLAG: --domain="cluster.local."
I0206 16:29:04.385841       1 server.go:112] FLAG: --federations=""
I0206 16:29:04.385873       1 server.go:112] FLAG: --healthz-port="8081"
I0206 16:29:04.385883       1 server.go:112] FLAG: --initial-sync-timeout="1m0s"
I0206 16:29:04.385898       1 server.go:112] FLAG: --kube-master-url=""
I0206 16:29:04.385910       1 server.go:112] FLAG: --kubecfg-file=""
I0206 16:29:04.385919       1 server.go:112] FLAG: --log-backtrace-at=":0"
I0206 16:29:04.385941       1 server.go:112] FLAG: --log-dir=""
I0206 16:29:04.385952       1 server.go:112] FLAG: --log-flush-frequency="5s"
I0206 16:29:04.385967       1 server.go:112] FLAG: --logtostderr="true"
I0206 16:29:04.385977       1 server.go:112] FLAG: --nameservers=""
I0206 16:29:04.385991       1 server.go:112] FLAG: --stderrthreshold="2"
I0206 16:29:04.386000       1 server.go:112] FLAG: --v="2"
I0206 16:29:04.386014       1 server.go:112] FLAG: --version="false"
I0206 16:29:04.386030       1 server.go:112] FLAG: --vmodule=""
I0206 16:29:04.386288       1 server.go:194] Starting SkyDNS server (0.0.0.0:10053)
I0206 16:29:04.386995       1 server.go:213] Skydns metrics enabled (/metrics:10055)
I0206 16:29:04.387055       1 dns.go:146] Starting endpointsController
I0206 16:29:04.387068       1 dns.go:149] Starting serviceController
I0206 16:29:04.387334       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0206 16:29:04.387391       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0206 16:29:04.887521       1 dns.go:170] Initialized services and endpoints from apiserver
I0206 16:29:04.887633       1 server.go:128] Setting up Healthz Handler (/readiness)
I0206 16:29:04.887663       1 server.go:133] Setting up cache handler (/cache)
I0206 16:29:04.887677       1 server.go:119] Status HTTP port 8081
I0206 17:08:19.253949       1 dns.go:555] Could not find endpoints for service "prometheus-operated" in namespace "default". DNS records will be created once endpoints show up.
I0207 13:27:27.817457       1 dns.go:555] Could not find endpoints for service "prometheus-operated" in namespace "default". DNS records will be created once endpoints show up.
$ kubectl logs -n kube-system kube-dns-6f4fd4bdf-v9b2v kubedns
I0206 16:28:59.366665       1 dns.go:48] version: 1.14.6-3-gc36cb11
I0206 16:28:59.486752       1 server.go:69] Using configuration read from directory: /kube-dns-config with period 10s
I0206 16:28:59.486951       1 server.go:112] FLAG: --alsologtostderr="false"
I0206 16:28:59.487000       1 server.go:112] FLAG: --config-dir="/kube-dns-config"
I0206 16:28:59.487019       1 server.go:112] FLAG: --config-map=""
I0206 16:28:59.487028       1 server.go:112] FLAG: --config-map-namespace="kube-system"
I0206 16:28:59.487041       1 server.go:112] FLAG: --config-period="10s"
I0206 16:28:59.487065       1 server.go:112] FLAG: --dns-bind-address="0.0.0.0"
I0206 16:28:59.487083       1 server.go:112] FLAG: --dns-port="10053"
I0206 16:28:59.487110       1 server.go:112] FLAG: --domain="cluster.local."
I0206 16:28:59.487133       1 server.go:112] FLAG: --federations=""
I0206 16:28:59.487154       1 server.go:112] FLAG: --healthz-port="8081"
I0206 16:28:59.487172       1 server.go:112] FLAG: --initial-sync-timeout="1m0s"
I0206 16:28:59.487189       1 server.go:112] FLAG: --kube-master-url=""
I0206 16:28:59.487209       1 server.go:112] FLAG: --kubecfg-file=""
I0206 16:28:59.487223       1 server.go:112] FLAG: --log-backtrace-at=":0"
I0206 16:28:59.487242       1 server.go:112] FLAG: --log-dir=""
I0206 16:28:59.487252       1 server.go:112] FLAG: --log-flush-frequency="5s"
I0206 16:28:59.487261       1 server.go:112] FLAG: --logtostderr="true"
I0206 16:28:59.487270       1 server.go:112] FLAG: --nameservers=""
I0206 16:28:59.487278       1 server.go:112] FLAG: --stderrthreshold="2"
I0206 16:28:59.487288       1 server.go:112] FLAG: --v="2"
I0206 16:28:59.487341       1 server.go:112] FLAG: --version="false"
I0206 16:28:59.487364       1 server.go:112] FLAG: --vmodule=""
I0206 16:28:59.487573       1 server.go:194] Starting SkyDNS server (0.0.0.0:10053)
I0206 16:28:59.488385       1 server.go:213] Skydns metrics enabled (/metrics:10055)
I0206 16:28:59.488421       1 dns.go:146] Starting endpointsController
I0206 16:28:59.488431       1 dns.go:149] Starting serviceController
I0206 16:28:59.488742       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0206 16:28:59.488764       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0206 16:28:59.988728       1 dns.go:170] Initialized services and endpoints from apiserver
I0206 16:28:59.988785       1 server.go:128] Setting up Healthz Handler (/readiness)
I0206 16:28:59.988815       1 server.go:133] Setting up cache handler (/cache)
I0206 16:28:59.988834       1 server.go:119] Status HTTP port 8081
I0206 17:08:19.250538       1 dns.go:555] Could not find endpoints for service "prometheus-operated" in namespace "default". DNS records will be created once endpoints show up.
I0207 13:28:50.005463       1 dns.go:555] Could not find endpoints for service "prometheus-operated" in namespace "default". DNS records will be created once endpoints show up.

It seems kube-dns is working, but the ambassador pods are still failing with: Readiness probe failed: Get http://10.244.5.35:8877/ambassador/v0/check_ready: dial tcp 10.244.5.35:8877: getsockopt: connection refused

The ambassador pod logs show nothing useful.
Removing the probes does, of course, make the pods work.

@mbovo

mbovo commented Feb 7, 2018

@richarddli I can confirm that setting initialDelaySeconds to 15 mitigates the issue.

@richarddli
Contributor

I've run into this a few times, and each time it seems that Kube DNS gets messed up. kubernetes/kubernetes#45976 may be related.

@richarddli
Contributor

@ashish1993 any update on your issue?

@mbovo is ambassador working for you?

We found an issue internally with some of our Kubernetes clusters; see kubernetes/kubeadm#273 and kubernetes/kubernetes#45828. The fix described in those issues resolves our problem.

(I'm going to close this issue in a week or so unless there is more data)

@chiraggupta06

Hi,
We have also tried installing it on Kubernetes, but the Ambassador pods crash after some time with the same error as above.

@plombardi89
Contributor

Thanks for the report, @chiraggupta06. Can you give us a little more information?

  • Kubernetes version?
  • Kubernetes Provisioning tool (Minikube, Kops, Kubeadm etc. ?)
  • Ambassador version?

@chiraggupta06

Kubernetes version: 1.9
Provisioning tool: Kargo (Kubespray)
Ambassador version: 0.29.0

@kflynn
Member

kflynn commented May 29, 2018

We're going to collect these under #437.

@kflynn kflynn closed this as completed May 29, 2018