- Vagrant -> Ansible -> Kubernetes
Vagrant does most of the initial setup here. It uses an Ansible provisioner to get a very basic install, following the instructions from the official docs starting at Installing kubeadm and going as far as Creating a single control-plane cluster with kubeadm. It uses Docker as the container runtime (CRI) and Calico as the Container Network Interface (CNI) plugin.
There are multiple assumptions for this walkthrough based on my personal setup. You may need to adjust to suit.
- Fedora 32
- vagrant from the fedora repos
- vagrant-libvirt from fedora repos
Install vagrant, vagrant-libvirt and ansible from the standard repos, then start the libvirtd service.
sudo dnf -y groupinstall virtualization
sudo dnf -y install vagrant vagrant-libvirt ansible
sudo systemctl enable --now libvirtd.service
Fedora patches vagrant-libvirt to use qemu:///session, which uses a userland networking system. This breaks private_network in vagrant-libvirt, so the Vagrantfile is set to use qemu:///system. If you want to see the machines using the virsh command you can either add the option --connect to your virsh command:
virsh --connect=qemu:///system list --all
Or you can set an environment variable:
export LIBVIRT_DEFAULT_URI=qemu:///system
To make this permanent you can add the above export command to your ~/.bashrc.
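Adding the export by hand works, but re-running a setup script can leave duplicate lines in ~/.bashrc. The following is a minimal sketch of an idempotent append; the helper name append_once is an illustration and not part of this repo.

```shell
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper name used for this example.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example: append_once 'export LIBVIRT_DEFAULT_URI=qemu:///system' ~/.bashrc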
Create /etc/polkit-1/rules.d/80-libvirt-manage.rules with the following content. This will allow people in the wheel group to use libvirt without requiring a password.
sudo vim /etc/polkit-1/rules.d/80-libvirt-manage.rules
polkit.addRule(function(action, subject) {
    if (action.id == "org.libvirt.unix.manage" && subject.local && subject.active && subject.isInGroup("wheel")) {
        return polkit.Result.YES;
    }
});
sudo chmod 644 /etc/polkit-1/rules.d/80-libvirt-manage.rules
sudo chown root:root /etc/polkit-1/rules.d/80-libvirt-manage.rules
Run Vagrant to bring up the boxes.
vagrant up
There is a race condition where sometimes the workers try to join before the join command has been exported from the master. If that is the case then re-provision using vagrant.
vagrant provision
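Because the race is intermittent, a bounded retry can save some manual re-runs. This is a sketch only: the retry helper and the attempt count and delay in the usage line are assumptions, not something the Vagrantfile provides.

```shell
# Run a command up to $1 times, pausing $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
# retry is a hypothetical helper name used for this example.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $n of $attempts failed: $*" >&2
        n=$((n + 1))
        # Only sleep if another attempt is coming
        [ "$n" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}
```

For example: retry 3 10 vagrant provision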
kubectl is the CLI management tool for Kubernetes. To install the latest stable version, download it from Google.
export PATH=${PWD}/bin/:$PATH
curl -L --output bin/kubectl https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl
chmod +x bin/kubectl
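The command above always fetches whatever stable.txt currently points at. If you would rather pin a release, the URL can be built from its parts. This sketch assumes the same storage.googleapis.com layout as the command above; kubectl_url is a hypothetical helper and v1.18.3 in the usage line is only an example version.

```shell
# Build the download URL for a given kubectl release.
# The defaults mirror the command above: linux/amd64.
# kubectl_url is a hypothetical helper name used for this example.
kubectl_url() {
    version=$1
    os=${2:-linux}
    arch=${3:-amd64}
    printf 'https://storage.googleapis.com/kubernetes-release/release/%s/bin/%s/%s/kubectl\n' \
        "$version" "$os" "$arch"
}
```

For example: curl -L --output bin/kubectl "$(kubectl_url v1.18.3)"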
To authenticate to the new Kubernetes cluster you will need a kubectl config file. This was generated by the Ansible provisioning and should now be under the ./etc directory. kubectl will look for this in several places: a --kubeconfig parameter passed to kubectl on each use takes precedence, then the KUBECONFIG environment variable, then ~/.kube/config. Here we use the environment variable.
export KUBECONFIG=${PWD}/etc/kubeadmin.conf
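To make the lookup order concrete, here is a small sketch that mimics it. It is a simplification: the real KUBECONFIG variable may hold a colon-separated list of files that kubectl merges, which this does not model. resolve_kubeconfig is a hypothetical helper, not a kubectl feature.

```shell
# Print the kubeconfig path that would be used, in order of precedence:
#   1. an explicit path (what you would pass as --kubeconfig)
#   2. the KUBECONFIG environment variable
#   3. the default ~/.kube/config
# resolve_kubeconfig is a hypothetical helper name used for this example.
resolve_kubeconfig() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    elif [ -n "${KUBECONFIG:-}" ]; then
        printf '%s\n' "$KUBECONFIG"
    else
        printf '%s\n' "$HOME/.kube/config"
    fi
}
```

With the export above in place, resolve_kubeconfig prints the path under ./etc.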
Next see how the kubernetes cluster is running:
kubectl get nodes
kubectl get all --all-namespaces
Make sure that all of the pods are in the Running state. If you have some that are in a CrashLoopBackOff state you should investigate the logs.
kubectl logs -n kube-system pod/etcd-master-11
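On a busy cluster it can be easier to filter the pod list than to read it. The sketch below assumes the column layout of kubectl get pods --all-namespaces --no-headers (namespace, name, ready, status, ...); unhealthy_pods is a hypothetical helper name.

```shell
# Read `kubectl get pods --all-namespaces --no-headers` output on stdin and
# print "namespace/name status" for every pod that is not Running or Completed.
# unhealthy_pods is a hypothetical helper name used for this example.
unhealthy_pods() {
    awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

For example: kubectl get pods --all-namespaces --no-headers | unhealthy_pods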
See the CNCF instructions or the more detailed README for sonobuoy.
Once you have sonobuoy installed, the way to check is just to run it. Running it with --mode quick is fast and only runs 1 test to quickly validate a working cluster; with --mode=certified-conformance it can take an hour or so.
sonobuoy run --mode=quick
or
sonobuoy run --mode=certified-conformance
When it is complete you can retrieve the results and display them.
results=$(sonobuoy retrieve)
sonobuoy results $results
The output should look similar to this, with 1 test passed and a lot skipped:
Plugin: e2e
Status: passed
Total: 4897
Passed: 1
Failed: 0
Skipped: 4896
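If you want to use the result in a script, the summary can be checked mechanically. This sketch assumes the summary format shown above (a Status: line and a Failed: line per plugin); sonobuoy_ok is a hypothetical helper, not a sonobuoy subcommand.

```shell
# Read `sonobuoy results` summary output on stdin. Succeed only if at least
# one "Status:" line was seen, every status is "passed", and no "Failed:"
# count is above zero.
# sonobuoy_ok is a hypothetical helper name used for this example.
sonobuoy_ok() {
    awk '
        $1 == "Status:" { seen = 1; if ($2 != "passed") bad = 1 }
        $1 == "Failed:" && $2 + 0 > 0 { bad = 1 }
        END { if (bad || !seen) exit 1 }
    '
}
```

For example: sonobuoy results $results | sonobuoy_ok && echo "cluster looks healthy"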
The dashboard is a pretty interface with some monitoring in it.
The main instructions are here
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.1/aio/deploy/recommended.yaml
Create a user and make it a member of the role cluster-admin (You should use something more secure in production). You can follow the instructions at https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/creating-sample-user.md or use the following commands.
kubectl create serviceaccount -n kubernetes-dashboard admin-user
kubectl create clusterrolebinding admin-user --clusterrole=cluster-admin --serviceaccount=kubernetes-dashboard:admin-user
Get the secret token, which is required to log into the dashboard.
kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')
Kubectl has a proxy in it to make accessing the web interface of the cluster possible in a secure way. Most of it is the API but services can quite often be accessed in this way.
kubectl proxy
The second way is to create a port forward. This way often works best as the application may be using redirects that break when using kubectl proxy. This command requires the node to have 'socat' installed.
kubectl --namespace kubernetes-dashboard port-forward svc/kubernetes-dashboard 8443:443
The original application for this was named heapster but it has now been deprecated. This is being replaced by a new service named metrics-server but, like all things in k8s, this is interchangeable with other solutions such as Prometheus.
Provided below are installation instructions for both Prometheus and metrics-server. You can only choose one. Prometheus is a much more comprehensive monitoring system and in most cases it is the preferred option, but if you are in a resource-constrained environment then metrics-server may be your preferred option.
The kube-prometheus project makes Prometheus run natively on Kubernetes. These instructions are based on the quickstart from the kube-prometheus README, which you should read for further details.
Download and install kube-prometheus
git clone https://github.com/coreos/kube-prometheus tmp/kube-prometheus
kubectl create -f tmp/kube-prometheus/manifests/setup
until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
kubectl create -f tmp/kube-prometheus/manifests/
You can then access it from your local machine using a port forward:
kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090
Open your browser to http://127.0.0.1:9090
You can access the rest of the services in a similar way.
kubectl --namespace monitoring port-forward svc/grafana 3000
http://127.0.0.1:3000
kubectl --namespace monitoring port-forward service/alertmanager-main 9093
http://127.0.0.1:9093
Metrics Server is a cluster-wide aggregator of resource usage data.
To install the metrics server, clone the git repo and then check out the latest revision. To list the available revisions use git tag -l.
git clone https://github.com/kubernetes-sigs/metrics-server ./tmp/metrics-server
cd ./tmp/metrics-server
git checkout v0.3.6
Once the correct revision is checked out you can create the server in kubernetes with the following command:
kubectl create -f deploy/1.8+/
The instructions are at https://rook.io/docs/rook/v1.3/ceph-quickstart.html
For rook-ceph to work you need either a raw block device, a raw partition or a blank LVM PV.
In my experience the main thing that causes issues here is networking internally to kubernetes.
The TL;DR for the rook quickstart is the following commands:
kubectl create -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/ceph/common.yaml
kubectl create -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/ceph/operator.yaml
kubectl create -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/ceph/cluster-test.yaml
If you want a production cluster you should read the docs in more detail at the quickstart link above and use cluster.yaml instead:
kubectl create -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/ceph/cluster.yaml
These take a little while to get bootstrapped. You can check the state of the pods using:
kubectl get all -n rook-ceph
Once the Ceph cluster is running you can access its dashboard using kubectl proxy. Use https:rook-ceph-mgr-dashboard:8443 as the service part of the proxy URL if you used cluster.yaml and http:rook-ceph-mgr-dashboard:7000 if you used cluster-test.yaml.
This is an example of an application not working correctly through the proxy: because of the way the Ceph Dashboard redirects on login, we should use the port-forward method instead. Use port 9443:8443 (because the Kubernetes dashboard is already on 8443) if you used cluster.yaml and 7000 if you used cluster-test.yaml.
kubectl --namespace rook-ceph port-forward service/rook-ceph-mgr-dashboard 7000
The default username is admin and the password is randomly generated and stored in Kubernetes. It can be extracted using the following:
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
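Values under .data in a Kubernetes Secret are base64 encoded, which is why the command above pipes through base64 --decode. The decode step on its own looks like the sketch below; decode_secret_value is a hypothetical helper name.

```shell
# Decode a base64 value read from stdin, as stored under .data in a Secret.
# decode_secret_value is a hypothetical helper name used for this example.
decode_secret_value() {
    base64 --decode
    echo   # stored values usually lack a trailing newline, so add one
}
```

For example: kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath='{.data.password}' | decode_secret_value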
Troubleshooting
If you need to remove the Ceph cluster for some reason you can follow the following instructions.
Here is the TL;DR:
First you need to delete any persistent volumes and claims. If you have done the WordPress example then these are the commands:
kubectl delete -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/mysql.yaml
kubectl delete -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/wordpress.yaml
Delete the Block and File artifacts
kubectl delete -n rook-ceph cephblockpool replicapool
kubectl delete storageclass rook-ceph-block
kubectl delete -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/ceph/csi/cephfs/kube-registry.yaml
kubectl delete storageclass csi-cephfs
Delete the CephCluster CRD. If you used cluster-test.yaml the cluster name is my-cluster; if you used cluster.yaml it is named rook-ceph.
# Pick the correct one of these!!!
kubectl -n rook-ceph delete cephcluster my-cluster
# or
kubectl -n rook-ceph delete cephcluster rook-ceph
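To avoid picking the wrong name by hand, the mapping described above can be written down once. This is only a sketch of that mapping; ceph_cluster_name is a hypothetical helper, and running kubectl -n rook-ceph get cephcluster remains the authoritative way to see what exists.

```shell
# Map the manifest used at install time to the CephCluster name it created.
# ceph_cluster_name is a hypothetical helper name used for this example.
ceph_cluster_name() {
    case $1 in
        cluster-test.yaml) echo my-cluster ;;
        cluster.yaml)      echo rook-ceph ;;
        *) echo "unknown manifest: $1" >&2; return 1 ;;
    esac
}
```

For example: kubectl -n rook-ceph delete cephcluster "$(ceph_cluster_name cluster-test.yaml)"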
Verify that the pods and cluster CRD have been deleted before continuing to the next step.
kubectl -n rook-ceph get cephcluster
When verifying the pods you will still see the operator pods, but all the rook-ceph-mon, rook-ceph-mgr and rook-ceph-osd pods should be gone.
kubectl -n rook-ceph get pod
Your output should look similar to this.
NAME                                            READY   STATUS    RESTARTS   AGE
csi-cephfsplugin-2lmpg                          3/3     Running   0          38m
csi-cephfsplugin-5wglt                          3/3     Running   0          38m
csi-cephfsplugin-ck9qg                          3/3     Running   0          38m
csi-cephfsplugin-provisioner-7469b99d4b-wnwh5   5/5     Running   0          38m
csi-cephfsplugin-provisioner-7469b99d4b-ztkl9   5/5     Running   0          38m
csi-rbdplugin-9xjzm                             3/3     Running   0          38m
csi-rbdplugin-dtmvs                             3/3     Running   0          38m
csi-rbdplugin-provisioner-865f4d8d-jvxdk        6/6     Running   0          38m
csi-rbdplugin-provisioner-865f4d8d-xb9vd        6/6     Running   0          38m
csi-rbdplugin-rkrqk                             3/3     Running   0          38m
rook-ceph-operator-5b6674cb6-qhcjj              1/1     Running   0          38m
rook-discover-2mcmh                             1/1     Running   0          38m
rook-discover-2tnhq                             1/1     Running   0          38m
rook-discover-8cf4w                             1/1     Running   0          38m
Delete the Operator and related Resources
kubectl delete -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/ceph/operator.yaml
kubectl delete -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/ceph/common.yaml
Check that all the operator pods get deleted and there is nothing left in the rook-ceph namespace.
kubectl -n rook-ceph get all
Delete the metadata on hosts.
ansible --become -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory worker -m file -a 'dest=/var/lib/rook/rook-ceph state=absent'
ansible --become -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory worker -m file -a 'dest=/var/lib/rook/mon-a state=absent'
ansible --become -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory worker -m file -a 'dest=/var/lib/rook/mon-b state=absent'
ansible --become -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory worker -m file -a 'dest=/var/lib/rook/mon-c state=absent'
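The four commands above differ only in the directory name, so they can be expressed as a loop. In this sketch the runner is a parameter so that passing echo gives a dry run; remove_rook_dirs and the INVENTORY variable are hypothetical names, while the inventory path and directory list are taken from the commands above.

```shell
# Inventory path as used by the commands above.
INVENTORY=.vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory

# Remove the rook metadata directories on every worker.
# $1 is the command to run (default: ansible); pass "echo" for a dry run.
# remove_rook_dirs is a hypothetical helper name used for this example.
remove_rook_dirs() {
    runner=${1:-ansible}
    for dir in rook-ceph mon-a mon-b mon-c; do
        "$runner" --become -i "$INVENTORY" worker \
            -m file -a "dest=/var/lib/rook/$dir state=absent"
    done
}
```

Run remove_rook_dirs echo first to see the commands, then remove_rook_dirs to execute them.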
Wipe the disks and reinitialize the block devices.
ansible --become -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory worker -a 'dd if=/dev/zero of="/dev/vdb" bs=1M count=100 oflag=direct,dsync'
ansible --become -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory worker -a 'bash -c "ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %"'
ansible --become -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory worker -a 'rm -rf /dev/ceph-*'
ansible --become -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory worker -a 'lsblk'
Check that it is all cleared up.
ansible -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory worker -a 'ls /var/lib/rook/'
See the Troubleshooting section at the bottom of this page if you run into issues.
To check the status of the Ceph cluster you can deploy the rook-ceph-toolbox. Remember to change to the page that matches the version you have installed.
Here is a TL;DR for rook-ceph-toolbox:
kubectl create -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/ceph/toolbox.yaml
kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- bash
The quick commands to get an overview are:
ceph status
ceph osd status
ceph df
rados df
Once you are done you can delete the toolbox:
kubectl -n rook-ceph delete deployment rook-ceph-tools
The following will create a rook-ceph-block StorageClass which can be used by applications. Volumes will be ext4 by default, so if you want to change that, download the YAML files and edit them before creating.
There are 2 different YAML files available at https://github.com/rook/rook/tree/release-1.3/cluster/examples/kubernetes/ceph/csi/rbd
storageclass.yaml requires 3 replicas, while storageclass-test.yaml is designed for testing on a single node. Pick one and create it.
kubectl create -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml
or
kubectl create -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/ceph/csi/rbd/storageclass-test.yaml
Some sample apps are provided with rook for testing.
kubectl create -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/mysql.yaml
kubectl create -f https://raw.githubusercontent.com/rook/rook/release-1.3/cluster/examples/kubernetes/wordpress.yaml
You can see the state of the apps using kubectl get all, as these will be deployed in the default namespace.
$ kubectl get all
NAME                                  READY   STATUS              RESTARTS   AGE
pod/wordpress-5b886cf59b-97rfl        0/1     ContainerCreating   0          114s
pod/wordpress-mysql-b9ddd6d4c-2sklp   1/1     Running             0          115s

NAME                      TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/kubernetes        ClusterIP      10.96.0.1       <none>        443/TCP        126m
service/wordpress         LoadBalancer   10.101.90.166   <pending>     80:30118/TCP   114s
service/wordpress-mysql   ClusterIP      None            <none>        3306/TCP       116s

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/wordpress         0/1     1            0           114s
deployment.apps/wordpress-mysql   1/1     1            1           115s

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/wordpress-5b886cf59b        1         1         0       114s
replicaset.apps/wordpress-mysql-b9ddd6d4c   1         1         1       115s
When everything is Running you can access the WordPress install via kubectl proxy:
http://localhost:8001/api/v1/namespaces/default/services/http:wordpress:/proxy/
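The proxy URL follows a fixed pattern: namespace, then scheme:service:port. The sketch below builds it from those parts, assuming kubectl proxy is listening on its default port 8001; proxy_url is a hypothetical helper name.

```shell
# Build the URL that `kubectl proxy` exposes for a Service.
#   $1 namespace, $2 scheme (http or https), $3 service name,
#   $4 port name or number (may be empty), $5 local proxy port (default 8001)
# proxy_url is a hypothetical helper name used for this example.
proxy_url() {
    printf 'http://localhost:%s/api/v1/namespaces/%s/services/%s:%s:%s/proxy/\n' \
        "${5:-8001}" "$1" "$2" "$3" "${4:-}"
}
```

proxy_url default http wordpress reproduces the WordPress URL above, and proxy_url rook-ceph http rook-ceph-mgr-dashboard 7000 gives the equivalent for the Ceph dashboard deployed from cluster-test.yaml.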
You may have noticed with the WordPress example that the EXTERNAL-IP of service/wordpress is sitting in the <pending> state. This is because it is set up to use a load balancer. The next section will cover creating a load balancer.
Kubernetes doesn't supply a LoadBalancer for clusters that aren't running on cloud platforms. The implementations they do have expect to be able to speak to a cloud platform load-balancer. Luckily for us there is a project MetalLB which provides just what we need.
There are a bunch of caveats with MetalLB which affect cloud providers and some of the CNIs, so check out the MetalLB compatibility docs if you are following these instructions but have made some changes. In this example the decision earlier to use Calico works fine because we won't be using BGP.
Installation is a simple apply of a manifest:
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/metallb.yaml
# On first install only
kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"
MetalLB can use ARP or BGP to manage the address space. The simpler of the two and one that suits this example is ARP.
We need to create a manifest and give MetalLB some addresses to manage. Create the following file as ./tmp/metallb_address-pool.yaml. The address range is the one set up in the Vagrantfile as the private network metallb. We don't need to actually bind the IP addresses to the interfaces in the nodes because MetalLB will respond to ARP requests and do this all at layer 2.
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.200.100-192.168.200.250
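If you changed the private network in the Vagrantfile, the same manifest can be generated for a different range instead of edited by hand. This is a sketch that reproduces the ConfigMap above with the range as a parameter; metallb_config is a hypothetical helper name. Indentation matters here because the manifest is YAML.

```shell
# Print a MetalLB layer2 ConfigMap for the given address range.
# metallb_config is a hypothetical helper name used for this example.
metallb_config() {
    range=$1
    cat <<EOF
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - $range
EOF
}
```

For example: metallb_config 192.168.200.100-192.168.200.250 > tmp/metallb_address-pool.yaml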
Next apply it using kubectl
kubectl apply -f tmp/metallb_address-pool.yaml
If everything has gone to plan the wordpress service should now have an external IP address that can be accessed from your workstation.
You can check this by getting the services:
$ kubectl get services -o wide
NAME              TYPE           CLUSTER-IP     EXTERNAL-IP       PORT(S)        AGE     SELECTOR
kubernetes        ClusterIP      10.96.0.1      <none>            443/TCP        4h59m   <none>
wordpress         LoadBalancer   10.96.130.76   192.168.200.100   80:31814/TCP   13m     app=wordpress,tier=frontend
wordpress-mysql   ClusterIP      None           <none>            3306/TCP       13m     app=wordpress,tier=mysql
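To use the address in a script, the EXTERNAL-IP column can be pulled out of that listing. The sketch assumes the column order shown above, where EXTERNAL-IP is the fourth field; external_ip is a hypothetical helper name.

```shell
# Read `kubectl get services` output on stdin and print the EXTERNAL-IP
# column (field 4) for the service named in $1.
# external_ip is a hypothetical helper name used for this example.
external_ip() {
    awk -v name="$1" '$1 == name { print $4 }'
}
```

For example: curl -I "http://$(kubectl get services | external_ip wordpress)/"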
In this instance NAT has been used but it should be possible to use macvtap to make this accessible from other machines in the network.
This is not necessary for everything above but sometimes it is useful, especially because it is not simple to copy files from a vagrant node to the host. See the Vagrant docs for more details.
Inside the Vagrantfile you can share this directory with each of the nodes. The share is currently disabled, so to activate it you need to edit the Vagrantfile and change the share to disabled: false.
To allow the users in the wheel group to configure NFS shares for vagrant, create the file /etc/sudoers.d/vagrant-syncedfolders with the following content.
sudo vim /etc/sudoers.d/vagrant-syncedfolders
Cmnd_Alias VAGRANT_EXPORTS_CHOWN = /bin/chown 0\:0 /tmp/*
Cmnd_Alias VAGRANT_EXPORTS_MV = /bin/mv -f /tmp/* /etc/exports
Cmnd_Alias VAGRANT_NFSD_CHECK = /usr/bin/systemctl status --no-pager nfs-server.service
Cmnd_Alias VAGRANT_NFSD_START = /usr/bin/systemctl start nfs-server.service
Cmnd_Alias VAGRANT_NFSD_APPLY = /usr/sbin/exportfs -ar
%wheel ALL=(root) NOPASSWD: VAGRANT_EXPORTS_CHOWN, VAGRANT_EXPORTS_MV, VAGRANT_NFSD_CHECK, VAGRANT_NFSD_START, VAGRANT_NFSD_APPLY
sudo chmod 644 /etc/sudoers.d/vagrant-syncedfolders
sudo chown root:root /etc/sudoers.d/vagrant-syncedfolders
If you are having difficulty with NFS not mounting you may need to allow it through the firewall on the host. On Fedora >= 31 you need to allow NFS through the libvirt firewalld zone.
firewall-cmd --zone=libvirt --list-all
firewall-cmd --permanent --zone=libvirt --add-service=nfs
firewall-cmd --permanent --zone=libvirt --add-service=mountd
firewall-cmd --permanent --zone=libvirt --add-service=rpc-bind
firewall-cmd --permanent --zone=libvirt --add-port=2049/tcp
firewall-cmd --permanent --zone=libvirt --add-port=2049/udp
firewall-cmd --reload
firewall-cmd --zone=libvirt --list-all