In this section, we will review how to launch and manage applications using StatefulSets and Persistent Volumes. We will deploy a MySQL database using a StatefulSet and EBS volumes. The example is a single-master MySQL topology with multiple slaves running asynchronous replication.
The example consists of a ConfigMap, two MySQL services, and a StatefulSet. We will deploy the database, send some traffic to test the connection status, walk through a few failure modes, and review the resiliency that is built into the StatefulSet. Lastly, we'll demonstrate how to scale the StatefulSet.
This chapter uses a cluster with 3 master nodes and 5 worker nodes, as described earlier: a multi-master, multi-node, gossip-based cluster.
All configuration files for this chapter are in the statefulsets directory. Make sure you change to that directory before running any commands in this chapter.
Using a ConfigMap, you can control the MySQL configuration independently. The ConfigMap looks as shown:
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-config
  labels:
    app: mysql
data:
  master.cnf: |
    # Apply this config only on the master.
    [mysqld]
    log-bin
  slave.cnf: |
    # Apply this config only on slaves.
    [mysqld]
    super-read-only
In this case, the master serves replication logs to the slaves, and the slaves are read-only. Create the ConfigMap using the command shown:
$ kubectl create -f templates/mysql-configmap.yaml
configmap "mysql-config" created
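How does each pod pick the right file out of the ConfigMap? A common pattern, and roughly what this configuration does, is an init container that derives the pod's ordinal index from its hostname and copies the matching file into a conf.d directory shared with the mysql container. A minimal sketch of that logic (the mount paths are illustrative assumptions):

# Extract the ordinal index from the pod's hostname (e.g. mysql-1 -> 1).
[[ $(hostname) =~ -([0-9]+)$ ]] || exit 1
ordinal=${BASH_REMATCH[1]}
# Ordinal 0 is the master; every other ordinal is a slave.
if [[ $ordinal -eq 0 ]]; then
  cp /mnt/config-map/master.cnf /mnt/conf.d/
else
  cp /mnt/config-map/slave.cnf /mnt/conf.d/
fi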
Create two headless services using the following configuration:
# Headless service for stable DNS entries of StatefulSet members.
apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  ports:
  - name: mysql
    port: 3306
  clusterIP: None
  selector:
    app: mysql
---
# Client service for connecting to any MySQL instance for reads.
# For writes, you must instead connect to the master: mysql-0.mysql.
apiVersion: v1
kind: Service
metadata:
  name: mysql-read
  labels:
    app: mysql
spec:
  ports:
  - name: mysql
    port: 3306
  selector:
    app: mysql
The mysql service provides DNS resolution: as pods are created by the StatefulSet controller, each one can be resolved as <pod-name>.mysql. mysql-read is a client-facing service that load balances read connections across all MySQL instances.
$ kubectl create -f templates/mysql-services.yaml
service "mysql" created
service "mysql-read" created
Only read queries can use the load-balanced mysql-read service. Because there is only one MySQL master, clients should connect directly to the master pod, mysql-0.mysql, to execute writes.
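Once the StatefulSet's pods are up and running (we create it next), you can verify these DNS entries with a short-lived client pod. A quick check, where the busybox image and the dns-test pod name are arbitrary choices:

kubectl run dns-test --image=busybox -i --rm --restart=Never --\
  nslookup mysql-0.mysql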
Finally, we create the StatefulSet from the configuration in templates/mysql-statefulset.yaml.
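The essential structure of that manifest is sketched below, trimmed for brevity. It ties together everything defined so far: the headless mysql service, the app: mysql label, and a volumeClaimTemplate named data that gives each pod its own EBS-backed PVC (data-mysql-0, and so on). The storage size and the elided container details are illustrative, not a substitute for the actual file:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql          # the headless service created above
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql
    spec:
      initContainers:         # pick master/slave config, clone data from a peer
      - name: init-mysql
        # ...
      containers:
      - name: mysql
        image: mysql:5.7
        # ...
      - name: xtrabackup      # sidecar handling replication bootstrap
        # ...
  volumeClaimTemplates:       # one PVC per pod: data-mysql-0, data-mysql-1, ...
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi       # illustrative size

Create the StatefulSet using the command shown: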
$ kubectl create -f templates/mysql-statefulset.yaml
statefulset "mysql" created
$ kubectl get -w statefulset
NAME      DESIRED   CURRENT   AGE
mysql     3         1         8s
mysql     3         2         59s
mysql     3         3         2m
mysql     3         3         3m
In a different terminal window, you can watch the progress of pod creation using the following command:
$ kubectl get pods -l app=mysql --watch
NAME      READY     STATUS            RESTARTS   AGE
mysql-0   0/2       Init:0/2          0          30s
mysql-0   0/2       Init:1/2          0          35s
mysql-0   0/2       PodInitializing   0          47s
mysql-0   1/2       Running           0          48s
mysql-0   2/2       Running           0          59s
mysql-1   0/2       Pending           0          0s
mysql-1   0/2       Pending           0          0s
mysql-1   0/2       Pending           0          0s
mysql-1   0/2       Init:0/2          0          0s
mysql-1   0/2       Init:1/2          0          35s
mysql-1   0/2       Init:1/2          0          45s
mysql-1   0/2       PodInitializing   0          54s
mysql-1   1/2       Running           0          55s
mysql-1   2/2       Running           0          1m
mysql-2   0/2       Pending           0          <invalid>
mysql-2   0/2       Pending           0          <invalid>
mysql-2   0/2       Pending           0          0s
mysql-2   0/2       Init:0/2          0          0s
mysql-2   0/2       Init:1/2          0          32s
mysql-2   0/2       Init:1/2          0          43s
mysql-2   0/2       PodInitializing   0          50s
mysql-2   1/2       Running           0          52s
mysql-2   2/2       Running           0          56s
Press Ctrl+C to stop watching. Notice that the pods are initialized in an orderly fashion during their startup. This is because the StatefulSet controller assigns each pod a unique, stable name (mysql-0, mysql-1, mysql-2), with mysql-0 being the master and the others being slaves. The configuration uses Percona XtraBackup, an open-source tool, to clone the source MySQL server's data to its slaves.
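The cloning works roughly as follows: each new slave, before starting MySQL, streams a backup from the pod that precedes it and prepares it with xtrabackup. A sketch of that init step, assuming a sidecar on each pod serves backups on port 3307 (the port and paths are assumptions for illustration, not necessarily this exact manifest):

# Skip if data already exists (e.g. after a pod restart).
[[ -d /var/lib/mysql/mysql ]] && exit 0
# Skip the clone on the master (ordinal 0).
[[ $(hostname) =~ -([0-9]+)$ ]] || exit 1
ordinal=${BASH_REMATCH[1]}
[[ $ordinal -eq 0 ]] && exit 0
# Stream a backup from the preceding peer and unpack it.
ncat --recv-only mysql-$(($ordinal-1)).mysql 3307 | xbstream -x -C /var/lib/mysql
# Prepare the backup so MySQL can start from it.
xtrabackup --prepare --target-dir=/var/lib/mysql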
You can use mysql-client to send some data to the master (mysql-0.mysql):
kubectl run mysql-client --image=mysql:5.7 -i --rm --restart=Never --\
mysql -h mysql-0.mysql <<EOF
CREATE DATABASE test;
CREATE TABLE test.messages (message VARCHAR(250));
INSERT INTO test.messages VALUES ('hello, from mysql-client');
EOF
You can run the following to test if the slaves (mysql-read) received the data:
$ kubectl run mysql-client --image=mysql:5.7 -it --rm --restart=Never --\
mysql -h mysql-read -e "SELECT * FROM test.messages"
This should display an output like this:
+--------------------------+
| message                  |
+--------------------------+
| hello, from mysql-client |
+--------------------------+
To test load balancing across slaves, you can run the following command:
kubectl run mysql-client-loop --image=mysql:5.7 -i -t --rm --restart=Never --\
  bash -ic "while sleep 1; do mysql -h mysql-read -e 'SELECT @@server_id,NOW()'; done"
Each MySQL instance is assigned a unique identifier, which can be retrieved using @@server_id. This command prints, in an infinite loop, the server id that served the request and the current timestamp:
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         100 | 2017-10-24 03:01:11 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         100 | 2017-10-24 03:01:12 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         102 | 2017-10-24 03:01:13 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         101 | 2017-10-24 03:01:14 |
+-------------+---------------------+
You can leave this open in a separate window while you run the failure modes in the next section. Alternatively, press Ctrl+C to terminate the loop.
We will now see how the StatefulSet behaves in different failure modes. The following modes will be tested:
- Unhealthy container
- Failed pod
- Failed node
The MySQL container uses a readiness probe that runs mysql -h 127.0.0.1 -e 'SELECT 1' on the server to make sure the MySQL server is still active.
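In the pod template, that check appears as an exec readiness probe on the mysql container. It looks roughly like this (the timing values are illustrative):

readinessProbe:
  exec:
    # The probe shells out to the mysql binary inside the container.
    command: ["mysql", "-h", "127.0.0.1", "-e", "SELECT 1"]
  initialDelaySeconds: 5
  periodSeconds: 2
  timeoutSeconds: 1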
Run this command to simulate MySQL being unresponsive:
kubectl exec mysql-2 -c mysql -- mv /usr/bin/mysql /usr/bin/mysql.off
This command renames the /usr/bin/mysql command so that the readiness probe cannot find it. A few seconds later, at the next health check, the pod should report that one of its containers is not healthy. This can be verified using the command:
$ kubectl get pod mysql-2
NAME      READY     STATUS    RESTARTS   AGE
mysql-2   1/2       Running   0          12m
The mysql-read load balancer detects failures like this and stops sending traffic to the failed container. You can check this if you still have the loop running in a separate window; it now shows output from only the healthy instances:
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         101 | 2017-10-24 03:17:09 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         101 | 2017-10-24 03:17:10 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         100 | 2017-10-24 03:17:11 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         100 | 2017-10-24 03:17:12 |
+-------------+---------------------+
Revert the container to its initial state:
kubectl exec mysql-2 -c mysql -- mv /usr/bin/mysql.off /usr/bin/mysql
Check the status again to see that both containers in the pod are running and healthy:
$ kubectl get pod -w mysql-2
NAME      READY     STATUS    RESTARTS   AGE
mysql-2   2/2       Running   0          5h
And the loop is now also showing all three servers.
To simulate a failed pod, you can delete a pod as shown:
$ kubectl delete pod mysql-2
pod "mysql-2" deleted
The StatefulSet controller recognizes the failed pod and creates a new one with the same name, linked to the same PersistentVolumeClaim.
$ kubectl get pod -w mysql-2
NAME      READY     STATUS            RESTARTS   AGE
mysql-2   0/2       Init:0/2          0          28s
mysql-2   0/2       Init:1/2          0          31s
mysql-2   0/2       PodInitializing   0          32s
mysql-2   1/2       Running           0          33s
mysql-2   2/2       Running           0          37s
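You can confirm that the new pod reattached to the existing volume rather than a fresh one by listing the claims; PVCs created from the volumeClaimTemplate are named data-mysql-0, data-mysql-1, and data-mysql-2, and all three should still show as Bound with their original ages:

kubectl get pvc -l app=mysql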
Kubernetes allows a node to be marked unschedulable using the kubectl drain command. This prevents new pods from being scheduled on the node. If the API server supports eviction, it evicts the pods; otherwise, it deletes them. This applies to all pods except mirror pods (which cannot be deleted through the API server). Read more about drain at https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/.
You can simulate node downtime by draining the node. To determine which node to drain, run this command:
$ kubectl get pod mysql-2 -o wide
NAME      READY     STATUS    RESTARTS   AGE       IP            NODE
mysql-2   2/2       Running   0          11m       100.96.6.12   ip-172-20-64-152.ec2.internal
Drain the node using the command:
$ kubectl drain ip-172-20-64-152.ec2.internal --force --delete-local-data --ignore-daemonsets
node "ip-172-20-64-152.ec2.internal" cordoned
WARNING: Deleting pods with local storage: mysql-2; Deleting pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet: kube-proxy-ip-172-20-64-152.ec2.internal
pod "kube-dns-479524115-76s6j" evicted
pod "mysql-2" evicted
node "ip-172-20-64-152.ec2.internal" drained
You can look at the list of nodes:
$ kubectl get nodes
NAME                             STATUS                     ROLES     AGE       VERSION
ip-172-20-107-81.ec2.internal    Ready                      node      10h       v1.7.4
ip-172-20-122-243.ec2.internal   Ready                      master    10h       v1.7.4
ip-172-20-125-181.ec2.internal   Ready                      node      10h       v1.7.4
ip-172-20-37-239.ec2.internal    Ready                      master    10h       v1.7.4
ip-172-20-52-200.ec2.internal    Ready                      node      10h       v1.7.4
ip-172-20-57-5.ec2.internal      Ready                      node      10h       v1.7.4
ip-172-20-64-152.ec2.internal    Ready,SchedulingDisabled   node      10h       v1.7.4
ip-172-20-76-117.ec2.internal    Ready                      master    10h       v1.7.4
Notice how scheduling is disabled on one node.
Now you can watch the pod being rescheduled:
kubectl get pod mysql-2 -o wide --watch
The output stays at:
NAME      READY     STATUS    RESTARTS   AGE       IP        NODE
mysql-2   0/2       Pending   0          33s       <none>    <none>
At first this looks like a bug in the StatefulSet, but the pod is failing to reschedule because no other worker node is running in the Availability Zone (AZ) where the original node was. The EBS volume cannot attach to nodes in a different AZ.
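You can confirm the zone constraint by inspecting the PersistentVolume bound to the pod's claim. A quick check (the jsonpath lookup is a convenience; dynamically provisioned EBS PVs in this Kubernetes version carry a failure-domain zone label):

# Find the PV backing mysql-2's claim and show its labels, including the AZ.
kubectl get pv $(kubectl get pvc data-mysql-2 -o jsonpath='{.spec.volumeName}') \
  --show-labels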
To mitigate this issue, manually scale the worker nodes to 6 so that an additional node becomes available in that AZ. Your scenario could be different and may not need this step.
Edit the number of nodes to 6 if you run into the Pending issue:
kops edit ig nodes
Change the specification to:
spec:
  image: kope.io/k8s-1.7-debian-jessie-amd64-hvm-ebs-2017-07-28
  machineType: t2.medium
  maxSize: 6
  minSize: 6
  role: Node
  subnets:
  - us-east-1a
  - us-east-1b
  - us-east-1c
Review and commit changes:
kops update cluster --yes
It takes a few minutes for a new node to be provisioned. This can be verified using the command shown:
$ kubectl get nodes
NAME                             STATUS                     ROLES     AGE       VERSION
ip-172-20-107-81.ec2.internal    Ready                      node      10h       v1.7.4
ip-172-20-122-243.ec2.internal   Ready                      master    10h       v1.7.4
ip-172-20-125-181.ec2.internal   Ready                      node      10h       v1.7.4
ip-172-20-37-239.ec2.internal    Ready                      master    10h       v1.7.4
ip-172-20-52-200.ec2.internal    Ready                      node      10h       v1.7.4
ip-172-20-57-5.ec2.internal      Ready                      node      10h       v1.7.4
ip-172-20-64-152.ec2.internal    Ready,SchedulingDisabled   node      10h       v1.7.4
ip-172-20-73-181.ec2.internal    Ready                      node      1m        v1.7.4
ip-172-20-76-117.ec2.internal    Ready                      master    10h       v1.7.4
Now you can watch the status of the pod:
$ kubectl get pod mysql-2 -o wide
NAME      READY     STATUS    RESTARTS   AGE       IP           NODE
mysql-2   2/2       Running   0          11m       100.96.8.2   ip-172-20-73-181.ec2.internal
Let's put the previously drained node back into its normal state:
$ kubectl uncordon ip-172-20-64-152.ec2.internal
node "ip-172-20-64-152.ec2.internal" uncordoned
The list of nodes is now shown as:
$ kubectl get nodes
NAME                             STATUS    ROLES     AGE       VERSION
ip-172-20-107-81.ec2.internal    Ready     node      10h       v1.7.4
ip-172-20-122-243.ec2.internal   Ready     master    10h       v1.7.4
ip-172-20-125-181.ec2.internal   Ready     node      10h       v1.7.4
ip-172-20-37-239.ec2.internal    Ready     master    10h       v1.7.4
ip-172-20-52-200.ec2.internal    Ready     node      10h       v1.7.4
ip-172-20-57-5.ec2.internal      Ready     node      10h       v1.7.4
ip-172-20-64-152.ec2.internal    Ready     node      10h       v1.7.4
ip-172-20-73-181.ec2.internal    Ready     node      3m        v1.7.4
ip-172-20-76-117.ec2.internal    Ready     master    10h       v1.7.4
More slaves can be added to the MySQL cluster to increase the read query capacity. This can be done using the command shown:
$ kubectl scale statefulset mysql --replicas=5
statefulset "mysql" scaled
Of course, you can watch the progress of the scaling:
kubectl get pods -l app=mysql -w
It shows the output:
NAME      READY     STATUS            RESTARTS   AGE
mysql-0   2/2       Running           0          6h
mysql-1   2/2       Running           0          6h
mysql-2   2/2       Running           0          16m
mysql-3   0/2       Init:0/2          0          1s
mysql-3   0/2       Init:1/2          0          18s
mysql-3   0/2       Init:1/2          0          28s
mysql-3   0/2       PodInitializing   0          36s
mysql-3   1/2       Running           0          37s
mysql-3   2/2       Running           0          43s
mysql-4   0/2       Pending           0          <invalid>
mysql-4   0/2       Pending           0          <invalid>
mysql-4   0/2       Pending           0          0s
mysql-4   0/2       Init:0/2          0          0s
mysql-4   0/2       Init:1/2          0          31s
mysql-4   0/2       Init:1/2          0          41s
mysql-4   0/2       PodInitializing   0          52s
mysql-4   1/2       Running           0          53s
mysql-4   2/2       Running           0          58s
If the loop is still running, it will print output as shown:
+-------------+---------------------+
|         101 | 2017-10-24 03:53:53 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         100 | 2017-10-24 03:53:54 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         102 | 2017-10-24 03:53:55 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         103 | 2017-10-24 03:53:57 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         103 | 2017-10-24 03:53:58 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         104 | 2017-10-24 03:53:59 |
+-------------+---------------------+
You can also verify that the new slaves have the same data set:
kubectl run mysql-client --image=mysql:5.7 -i -t --rm --restart=Never --\
  mysql -h mysql-3.mysql -e "SELECT * FROM test.messages"
It still shows the same result:
+--------------------------+
| message                  |
+--------------------------+
| hello, from mysql-client |
+--------------------------+
You can scale down by using the command shown:
$ kubectl scale statefulset mysql --replicas=3
statefulset "mysql" scaled
Note that scaling in doesn't delete the data or the PVCs attached to the pods. You have to delete them manually:
kubectl delete pvc data-mysql-3
kubectl delete pvc data-mysql-4
It shows the output:
persistentvolumeclaim "data-mysql-3" deleted
persistentvolumeclaim "data-mysql-4" deleted
To clean up, first delete the StatefulSet. This also terminates the pods:
$ kubectl delete statefulset mysql
statefulset "mysql" deleted
Verify there are no more pods running:
kubectl get pods -l app=mysql
It shows the output:
No resources found.
Delete the ConfigMap, services, and PVCs using the command:
$ kubectl delete configmap,service,pvc -l app=mysql
configmap "mysql-config" deleted
service "mysql" deleted
service "mysql-read" deleted
persistentvolumeclaim "data-mysql-0" deleted
persistentvolumeclaim "data-mysql-1" deleted
persistentvolumeclaim "data-mysql-2" deleted