This repository has been archived by the owner on Nov 9, 2017. It is now read-only.

Startup waits forever #4

Closed
theganyo opened this issue Jan 26, 2016 · 13 comments

Comments

@theganyo

When performing "up" on the system, I see the following:

Starting k8smaster-01 VM ...
> booting k8smaster-01
[corectl] stable/835.11.0 already available on your system
[corectl] '/Users/sganyo' was already available to VMs via NFS
[corectl] started 'k8smaster-01' in background with IP 192.168.64.3 and PID 14695

Starting k8snode-01 VM ...
> booting k8snode-01
[corectl] stable/835.11.0 already available on your system
[corectl] '/Users/sganyo' was already available to VMs via NFS
[corectl] started 'k8snode-01' in background with IP 192.168.64.4 and PID 14723

Starting k8snode-02 VM ...
> booting k8snode-02
[corectl] stable/835.11.0 already available on your system
[corectl] '/Users/sganyo' was already available to VMs via NFS
[corectl] started 'k8snode-02' in background with IP 192.168.64.5 and PID 14741


Waiting for k8smaster-01 to be ready...

fleetctl list-machines:
MACHINE     IP      METADATA
012edea6... 192.168.64.3    role=control

Waiting for Kubernetes cluster to be ready. This can take a few minutes...

But then it waits forever. Ideas?

@rimusz
Member

rimusz commented Jan 26, 2016

Fixed in v0.1.4.

@rimusz rimusz closed this as completed Jan 26, 2016
@theganyo
Author

Great! Thanks!

BTW: The Kubernetes-UI still doesn't run. That's less important, though. :)

@rimusz
Member

rimusz commented Jan 26, 2016

@theganyo it does work for me. After setup it takes a few minutes to bootstrap, because it has to download Docker images.

@theganyo
Author

I destroyed and recreated the VM. It's all good. Maybe it was still broken from the failed attempt earlier. Thanks again!

@cobrowserAlex

Just downloaded the latest v1.0.6 and it has been waiting for 20 minutes now. I did have a working kube-solo before I removed it. I'm on El Capitan 10.11.1; dunno if that matters.

Starting all fleet units in ~/kube-cluster/fleet:
Unit fleet-ui.service inactive
Unit fleet-ui.service launched on fc4edcaf.../192.168.64.4
Unit kube-apiserver.service inactive
Unit kube-apiserver.service launched on fc4edcaf.../192.168.64.4
Unit kube-controller-manager.service inactive
Unit kube-controller-manager.service launched on fc4edcaf.../192.168.64.4
Unit kube-scheduler.service inactive
Unit kube-scheduler.service launched on fc4edcaf.../192.168.64.4
Unit kube-apiproxy.service
Triggered global unit kube-apiproxy.service start
Unit kube-kubelet.service
Triggered global unit kube-kubelet.service start
Unit kube-proxy.service
Triggered global unit kube-proxy.service start

fleetctl list-units:
UNIT                MACHINE             ACTIVE      SUB
fleet-ui.service        fc4edcaf.../192.168.64.4    inactive    dead
kube-apiserver.service      fc4edcaf.../192.168.64.4    inactive    dead
kube-controller-manager.service fc4edcaf.../192.168.64.4    inactive    dead
kube-scheduler.service      fc4edcaf.../192.168.64.4    inactive    dead


Waiting for Kubernetes cluster to be ready. This can take a few minutes...

@rimusz
Member

rimusz commented Jan 27, 2016

Interesting. You can check what is going on with the Kubernetes fleet units by opening the Preset OS shell and typing $ fleetctl status kube-apiserver.service
Please post the results.
Maybe the k8s Go binaries were not copied to the VMs because the install got interrupted.
Try running Updates -> Update kubernetes to the latest version
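
The check suggested above can be narrowed down to the units that are not running. This is only a rough sketch: `not_active_units` is a hypothetical helper (it is not part of fleetctl or Kube-Cluster), and it assumes the four-column UNIT / MACHINE / ACTIVE / SUB layout that `fleetctl list-units` prints elsewhere in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of fleetctl or Kube-Cluster.
# Reads `fleetctl list-units` output on stdin and prints the name and state
# of every unit whose ACTIVE column is not "active".
# Assumes the four-column layout shown in this thread: UNIT MACHINE ACTIVE SUB.
not_active_units() {
  awk '$1 != "UNIT" && NF >= 4 && $3 != "active" { print $1, $3 "/" $4 }'
}

# Usage inside the Preset OS shell (not executed here):
#   fleetctl list-units | not_active_units
#   fleetctl status kube-apiserver.service
#   fleetctl journal kube-apiserver.service
```

If a unit shows up as inactive/dead, the `fleetctl status` and `fleetctl journal` output for that unit is probably the most useful thing to post.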

@cobrowserAlex

Hmm, I removed .core* and .kube and kube* and did a reinstall; now it works:

Waiting for Kubernetes cluster to be ready. This can take a few minutes...
/
node "192.168.64.5" labeled
node "192.168.64.6" labeled

Creating kube-system namespace ...
namespace "kube-system" created

Installing SkyDNS ...
replicationcontroller "kube-dns-v10" created
service "kube-dns" created

Installing Kubernetes UI ...
replicationcontroller "kube-ui-v4" created
service "kube-ui" created

fleetctl list-machines:
MACHINE     IP      METADATA
0668e53d... 192.168.64.6    role=node
1c262d41... 192.168.64.5    role=node
f17ec4e0... 192.168.64.4    role=control

fleetctl list-units:
UNIT                MACHINE             ACTIVE  SUB
fleet-ui.service        f17ec4e0.../192.168.64.4    active  running
kube-apiproxy.service       0668e53d.../192.168.64.6    active  running
kube-apiproxy.service       1c262d41.../192.168.64.5    active  running
kube-apiserver.service      f17ec4e0.../192.168.64.4    active  running
kube-controller-manager.service f17ec4e0.../192.168.64.4    active  running
kube-kubelet.service        0668e53d.../192.168.64.6    active  running
kube-kubelet.service        1c262d41.../192.168.64.5    active  running
kube-proxy.service      0668e53d.../192.168.64.6    active  running
kube-proxy.service      1c262d41.../192.168.64.5    active  running
kube-scheduler.service      f17ec4e0.../192.168.64.4    active  running

kubectl get nodes:
NAME           LABELS                                             STATUS    AGE
192.168.64.5   kubernetes.io/hostname=192.168.64.5,node=worker1   Ready     1s
192.168.64.6   kubernetes.io/hostname=192.168.64.6,node=worker2   Ready     18s

Installation has finished, Kube Cluster VMs are up and running !!!

Assigned static IP for master VM: 192.168.64.4
Assigned static IP for node1 VM: 192.168.64.5
Assigned static IP for node2 VM: 192.168.64.6

You can control this App via status bar icon...

Also you can install Deis PaaS (http://deis.io) v2 alpha version with 'install_deis' command ...

kube-solo worked at https://2015.distributed-matters.org/bcn/ but wouldn't start anymore afterwards...

@rimusz
Member

rimusz commented Jan 27, 2016

@cobrowserAlex hmm, it would be interesting to know where kube-cluster misbehaved.
It would be good to reproduce it.

kube-solo has evolved/changed a lot since distributed-matters.org/bcn/; have you tried the latest version?

@cobrowserAlex

Ha, I closed my MacBook, so I guess it went to sleep. When I opened it again, it had crashed.
After that I tried kube-cluster and I get the same behaviour as before:
"Waiting for Kubernetes cluster to be ready. This can take a few minutes..."

@rimusz
Member

rimusz commented Jan 27, 2016

@AntonioMeireles is there any way we can have a fix so that VMs left running do not crash when the Mac comes back from sleep?

@AntonioMeireles
Member

@rimusz looking at it.

@cobrowserAlex

It seems 'suspend' is not the issue, because putting the Mac to sleep and waking it up (instant start) while the VMs are running works for me. It is hibernation and restart that results in 'Waiting for Kubernetes cluster...'.
When the 'waiting' message is shown, I can enter the 'Preset OS shell' and get the kube API server status:

bash-3.2$ fleetctl status kube-apiserver.service
● kube-apiserver.service - Kubernetes API Server
   Loaded: loaded (/run/fleet/units/kube-apiserver.service; linked-runtime; vendor preset: disabled)
   Active: inactive (dead)
     Docs: https://github.com/GoogleCloudPlatform/kubernetes

Then I try to start it:

bash-3.2$ fleetctl start kube-apiserver.service
bash-3.2$ fleetctl status kube-apiserver.service
● kube-apiserver.service - Kubernetes API Server
   Loaded: loaded (/run/fleet/units/kube-apiserver.service; linked-runtime; vendor preset: disabled)
   Active: inactive (dead)
     Docs: https://github.com/GoogleCloudPlatform/kubernetes

restore / update fleet units:

redeploying fleet units:

Destroying Kubernetes fleet units ...
Destroyed kube-apiserver.service
Destroyed kube-controller-manager.service
Destroyed kube-scheduler.service
Destroyed kube-apiproxy.service
Destroyed kube-kubelet.service
Destroyed kube-proxy.service

Starting Kubernetes fleet units ...
Unit kube-apiserver.service inactive
Unit kube-apiserver.service launched on 3ae54d82.../192.168.64.4
Unit kube-controller-manager.service inactive
Unit kube-controller-manager.service launched on 3ae54d82.../192.168.64.4
Unit kube-scheduler.service inactive
Unit kube-scheduler.service launched on 3ae54d82.../192.168.64.4
Unit kube-apiproxy.service
Triggered global unit kube-apiproxy.service start
Unit kube-kubelet.service
Triggered global unit kube-kubelet.service start
Unit kube-proxy.service
Triggered global unit kube-proxy.service start

fleetctl list-units:
UNIT                MACHINE             ACTIVE      SUB
fleet-ui.service        3ae54d82.../192.168.64.4    inactive    dead
kube-apiproxy.service       4b48b3c6.../192.168.64.6    active      running
kube-apiproxy.service       b34f12af.../192.168.64.5    active      running
kube-apiserver.service      3ae54d82.../192.168.64.4    inactive    dead
kube-controller-manager.service 3ae54d82.../192.168.64.4    inactive    dead
kube-kubelet.service        4b48b3c6.../192.168.64.6    activating  start-pre
kube-kubelet.service        b34f12af.../192.168.64.5    activating  start-pre
kube-proxy.service      4b48b3c6.../192.168.64.6    activating  start-pre
kube-proxy.service      b34f12af.../192.168.64.5    activating  start-pre
kube-scheduler.service      3ae54d82.../192.168.64.4    inactive    dead

Waiting for Kubernetes cluster to be ready. This can take a few minutes...
error: couldn't read version from server: Get http://192.168.64.4:8080/api: dial tcp 192.168.64.4:8080: connection refused
/Applications/Kube-Cluster.app/Contents/Resources/restore_update_fleet_units.command: line 65: spin: i++%0: division by 0 (error token is "0")
error: couldn't read version from server: Get http://192.168.64.4:8080/api: dial tcp 192.168.64.4:8080: connection refused
|error: couldn't read version from server: Get http://192.168.64.4:8080/api: dial tcp 192.168.64.4:8080: connection refused
/error: couldn't read version from server: Get http://192.168.64.4:8080/api: dial tcp 192.168.64.4:8080: connection refused
etc...
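
The `division by 0` line in the log above looks like a spinner that indexes into an empty frame string, and the loop around it apparently has no timeout. The following is a minimal sketch of a guarded spinner and a bounded wait, assuming that reading is right. `spin_frame` and `wait_for` are hypothetical names and are not taken from `restore_update_fleet_units.command`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not taken from the Kube-Cluster scripts.

# Print the spinner frame for tick $1 from the frame string $2.
spin_frame() {
  local tick=$1
  local frames=$2
  local n=${#frames}
  # Guard: with an empty frame string, "tick % n" would divide by zero.
  if [ "$n" -eq 0 ]; then
    printf '.'
    return 0
  fi
  printf '%s' "${frames:$(( tick % n )):1}"
}

# Run the command given after the timeout once per second until it succeeds.
# Returns 1 if it has still not succeeded after $1 retries.
wait_for() {
  local timeout=$1
  shift
  local tick=0
  until "$@" >/dev/null 2>&1; do
    if [ "$tick" -ge "$timeout" ]; then
      return 1
    fi
    printf '\r%s' "$(spin_frame "$tick" '-\|/')" >&2
    sleep 1
    tick=$((tick + 1))
  done
  return 0
}

# Usage (not executed here), probing the API endpoint from the log above:
#   wait_for 300 curl -fs http://192.168.64.4:8080/api || echo "API server not reachable"
```

With a bound like this, the script would report that the API server is unreachable instead of printing `connection refused` indefinitely.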

@cobrowserAlex

Problem solved: I have an OpenVPN connection running for work that routes all traffic over the VPN. If I start kube-cluster after the VPN is up, it waits forever to connect. When I kill the OpenVPN connection, it works as expected.
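
For anyone hitting the same thing, one way to see whether traffic to the master VM is being sent into a VPN tunnel is to look at the interface macOS picks for that IP. This is a rough sketch: `route_iface` and `looks_like_tunnel` are hypothetical helpers, and the tunnel name patterns (`utun*`, `tun*`, `tap*`, `ppp*`) are an assumption about how VPN clients usually name their interfaces, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting `route -n get <ip>` output on macOS.

# Read `route -n get` output on stdin and print the interface name.
route_iface() {
  awk '$1 == "interface:" { print $2 }'
}

# Heuristic (assumption): VPN tunnels are often named utunN, tunN, tapN or pppN.
looks_like_tunnel() {
  case "$1" in
    utun*|tun*|tap*|ppp*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on the Mac (not executed here), with the master IP from this thread:
#   iface="$(route -n get 192.168.64.4 | route_iface)"
#   if looks_like_tunnel "$iface"; then
#     echo "traffic to the master VM goes via $iface, possibly the VPN"
#   fi
```

If the interface looks like a tunnel, starting kube-cluster before the VPN comes up, or disconnecting the VPN as described above, would be the thing to try.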

4 participants