KubeArmor Performance Benchmarking Guide
This methodology can scale up to a 100-node cluster, but to illustrate how it works we confine it to a 4-node cluster here. The tests and configurations below were performed on GKE, but the procedure is similar on any cloud platform.
- First, clone the Google Cloud microservices-demo repository, which serves as the workload in our benchmarking.
- Navigate to src/loadgenerator/Dockerfile.
- Replace
ENTRYPOINT locust --host="http://${FRONTEND_ADDR}" --headless -u "${USERS:-10}" 2>&1
with
ENTRYPOINT locust --host="http://${FRONTEND_ADDR}" -u "${USERS:-10}" 2>&1
Dropping --headless makes Locust serve its web UI, which is used later to drive the load.
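The same edit could be scripted rather than made by hand. The sketch below assumes the Dockerfile still contains the ENTRYPOINT line shown above; the function name drop_headless is hypothetical and not part of the repository.

```shell
# Remove the --headless flag from a Dockerfile so that Locust serves its web UI.
# drop_headless is a hypothetical helper name used only for this illustration.
drop_headless() {
  dockerfile="$1"
  # -i.bak edits the file in place and keeps the original as <file>.bak
  sed -i.bak 's/ --headless//' "$dockerfile"
}
```

It would be called as drop_headless src/loadgenerator/Dockerfile from the root of the cloned repository.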
- Build the Docker image from the src/loadgenerator directory and push it to Docker Hub.
docker build --tag <dockerhub-user-name>/locust .
docker push <dockerhub-user-name>/locust
- Now open the release/kubernetes-manifests.yaml file. We need to make some modifications to this manifest to fit our use case.
- Change the image of the loadgenerator to the one you created above, for example <dockerhub-user-name>/locust.
- Remove any resource limits on the loadgenerator, since in a large cluster it may crash while generating large loads.
- Add a toleration along with a nodeSelector so as to dedicate one separate node with high CPU and memory to the loadgenerator. The same tolerations and nodeSelector block that is applied to the relay later in this guide can be used here.
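As a rough sketch of what the added fields might look like, the snippet below writes a patch fragment that reuses the color/blue taint and the nodetype: node1 label that appear elsewhere in this guide. The helper name write_loadgen_patch and the output file name are hypothetical; the fields belong in the pod template spec of the loadgenerator Deployment.

```shell
# Write a patch fragment with the toleration and nodeSelector for the loadgenerator.
# write_loadgen_patch and loadgen-patch.yaml are hypothetical names for illustration.
write_loadgen_patch() {
  out="${1:-loadgen-patch.yaml}"
  # Quoted delimiter: the YAML is written verbatim, with no shell expansion.
  cat > "$out" <<'EOF'
spec:
  template:
    spec:
      tolerations:
      - key: color
        operator: Equal
        value: blue
        effect: NoSchedule
      nodeSelector:
        nodetype: node1
EOF
}
```

The tolerations and nodeSelector entries can be copied into the manifest by hand. Alternatively, assuming the Deployment is named loadgenerator as in the demo, the fragment could be applied to a running cluster with kubectl patch deployment loadgenerator --patch-file loadgen-patch.yaml.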
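- Now create a 4-node cluster on any cloud platform, in which one node has high CPU and memory specs, and apply a taint to this node. If the node does not already carry the label targeted by the nodeSelector (nodetype: node1 in this guide), add it as well.
kubectl taint nodes <node-name> color=blue:NoSchedule
kubectl label nodes <node-name> nodetype=node1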
- Now apply the manifest file that you just edited.
kubectl apply -f microservices-demo/release/kubernetes-manifests.yaml
- Now create a HorizontalPodAutoscaler (HPA) for each deployment so that the pods scale with the incoming load.
kubectl autoscale deployment cartservice --cpu-percent=50 --min=2 --max=400
kubectl autoscale deployment currencyservice --cpu-percent=50 --min=2 --max=400
kubectl autoscale deployment emailservice --cpu-percent=50 --min=2 --max=400
kubectl autoscale deployment checkoutservice --cpu-percent=50 --min=2 --max=400
kubectl autoscale deployment frontend --cpu-percent=50 --min=5 --max=400
kubectl autoscale deployment paymentservice --cpu-percent=50 --min=2 --max=400
kubectl autoscale deployment productcatalogservice --cpu-percent=50 --min=2 --max=400
kubectl autoscale deployment recommendationservice --cpu-percent=50 --min=2 --max=400
kubectl autoscale deployment redis-cart --cpu-percent=50 --min=1 --max=400
kubectl autoscale deployment shippingservice --cpu-percent=50 --min=2 --max=400
kubectl autoscale deployment adservice --cpu-percent=50 --min=1 --max=400
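The eleven commands differ only in the deployment name and the minimum replica count, so they could also be generated from a short table. This is a minimal sketch of that idea; the names HPA_TARGETS and autoscale_all are hypothetical, and the values are copied from the commands above.

```shell
# name:min pairs copied from the kubectl autoscale commands above.
HPA_TARGETS="cartservice:2 currencyservice:2 emailservice:2 checkoutservice:2 frontend:5 paymentservice:2 productcatalogservice:2 recommendationservice:2 redis-cart:1 shippingservice:2 adservice:1"

# Create one HPA per deployment, all with a 50% CPU target and a maximum of 400 replicas.
autoscale_all() {
  for entry in $HPA_TARGETS; do
    name="${entry%%:*}"   # text before the colon
    min="${entry##*:}"    # text after the colon
    kubectl autoscale deployment "$name" --cpu-percent=50 --min="$min" --max=400
  done
}
```

Calling autoscale_all runs the same commands as the list above; kubectl get hpa then shows the resulting autoscalers.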
- Now we need to expose the loadgenerator UI. To do so, create a Service that exposes the web UI to external traffic.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: loadgenerator-service
spec:
  selector:
    app: loadgenerator
  ports:
    - protocol: TCP
      port: 8089
      targetPort: 8089
      nodePort: 30001
  type: NodePort
EOF
Now copy the IP of one of the cluster nodes and open it in a web browser followed by the NodePort, i.e. <node-ip>:30001. You should now be able to access the Locust web UI.
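The node IP can also be looked up from the command line. The helper below is a sketch that assumes the first node in the cluster reports an ExternalIP address; the name nodeport_url is hypothetical. On some platforms, GKE included, a firewall rule allowing the NodePort may also be needed before the page loads.

```shell
# Print the URL of a NodePort service, using the ExternalIP of the first node.
# nodeport_url is a hypothetical helper name used only for this illustration.
nodeport_url() {
  port="$1"
  ip="$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="ExternalIP")].address}')"
  echo "http://${ip}:${port}"
}
```

For example, nodeport_url 30001 prints the address of the Locust web UI.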
- Now we need to deploy the monitoring stack, i.e. Prometheus, so that we can capture the CPU and memory usage of the microservices in real time.
- First, deploy kube-state-metrics (ksm) so that we can capture the performance metrics.
git clone https://github.com/kubernetes/kube-state-metrics.git
kubectl apply -f kube-state-metrics/examples/standard
- Now deploy the prometheus components one by one.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
EOF
cat <<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: default
  namespace: monitoring
EOF
# The heredoc delimiter is quoted so that the shell leaves ${1} and $1:$2 in the relabel rules untouched.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: monitoring
data:
  prometheus.rules: |-
    groups:
    - name: sample alert
      rules:
      - alert: High Pod Memory
        expr: sum(container_memory_usage_bytes) > 1
        for: 1m
        labels:
          severity: slack
        annotations:
          summary: High Memory Usage
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
      - /etc/prometheus/prometheus.rules
    alerting:
      alertmanagers:
      - scheme: http
        static_configs:
        - targets:
          - "alertmanager.monitoring.svc:9093"
    scrape_configs:
      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
        - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
        - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
          action: keep
          regex: default;kubernetes;https
      - job_name: 'kubernetes-nodes'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name
      - job_name: 'kube-state-metrics'
        static_configs:
          - targets: ['kube-state-metrics.kube-system.svc.cluster.local:8080']
      - job_name: 'kubernetes-cadvisor'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      - job_name: 'kubernetes-service-endpoints'
        kubernetes_sd_configs:
        - role: endpoints
        relabel_configs:
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
          action: replace
          target_label: __scheme__
          regex: (https?)
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: kubernetes_name
EOF
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
  namespace: monitoring
  labels:
    app: prometheus-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-server
  template:
    metadata:
      labels:
        app: prometheus-server
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus/
            - name: prometheus-storage-volume
              mountPath: /prometheus/
      volumes:
        - name: prometheus-config-volume
          configMap:
            defaultMode: 420
            name: prometheus-server-conf
        - name: prometheus-storage-volume
          emptyDir: {}
EOF
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
  namespace: monitoring
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '9090'
spec:
  selector:
    app: prometheus-server
  type: NodePort
  ports:
    - port: 8080
      targetPort: 9090
      nodePort: 30000
EOF
Now copy the IP of one of the cluster nodes and open it in a web browser followed by the NodePort, i.e. <node-ip>:30000. You should now be able to access the Prometheus web UI.
The benchmark results are recorded in a table like the following:
Scenario | Users | Kubearmor CPU | Kubearmor Relay CPU (m) | Throughput (req/s) | Failed Requests | Micro-service CPU (Frontend) | Micro-service CPU (CartService) | Micro-service CPU (CurrencyService) | Kubearmor Memory | Kubearmor Relay Memory |
---|---|---|---|---|---|---|---|---|---|---|
without KubeArmor | 2000 | - | - | 373.3 | 0 | 1124.6m (21 replicas) | 345.17m (3 replicas) | 994.6m (23 replicas) | - | - |
To get the CPU and memory usage of each microservice, run PromQL queries in the Prometheus web UI.
Below is the generic query to get the CPU usage of microservices:
sum(rate(container_cpu_usage_seconds_total{pod=~"<deployment-name>-.*", container = "", namespace="<deployment-namespace>"}[1m]))
Below is the generic query to get the memory usage of microservices:
sum(container_memory_usage_bytes{pod=~"<deployment-name>-.*", namespace="<deployment-namespace>"})
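The two generic queries can also be filled in and sent to the Prometheus HTTP API from the command line instead of the web UI. The helpers below are a sketch: the function names and the PROM_ADDR variable (expected to hold <node-ip>:30000) are hypothetical, while /api/v1/query is the standard Prometheus instant-query endpoint.

```shell
# Build the generic CPU query for a deployment name ($1) and namespace ($2).
cpu_query() {
  printf 'sum(rate(container_cpu_usage_seconds_total{pod=~"%s-.*", container="", namespace="%s"}[1m]))\n' "$1" "$2"
}

# Build the generic memory query for a deployment name ($1) and namespace ($2).
mem_query() {
  printf 'sum(container_memory_usage_bytes{pod=~"%s-.*", namespace="%s"})\n' "$1" "$2"
}

# Send a PromQL expression ($1) to Prometheus; PROM_ADDR is an assumed variable.
prom_query() {
  curl -sG "http://${PROM_ADDR}/api/v1/query" --data-urlencode "query=$1"
}
```

For example, prom_query "$(cpu_query frontend default)" would return the current CPU usage of the frontend pods as JSON, assuming the demo runs in the default namespace.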
To get the throughput, wait for the number of users to reach its upper limit, then wait a further 10-15 minutes, and then take the average of 5 throughput readings. Take these 5 values from the stable region of the corresponding users/s graph.
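The averaging step can be sketched as a small helper. The function name avg is hypothetical, and the readings in the usage example are made-up placeholders rather than measured values.

```shell
# Print the arithmetic mean of the numbers given as arguments, to one decimal place.
avg() {
  # Refuse to average an empty list.
  [ "$#" -gt 0 ] || return 1
  printf '%s\n' "$@" | awk '{ s += $1 } END { printf "%.1f\n", s / NR }'
}
```

With five hypothetical readings, avg 370 372 374 376 378 prints 374.0.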
You can deploy KubeArmor in several ways. One is as stated here. Alternatively, you can install karmor-cli and then install KubeArmor through it:
curl -sfL http://get.kubearmor.io/ | sudo sh -s -- -b /usr/local/bin
karmor install
After installing KubeArmor, we want the relay server to run on a node separate from the workloads so that we get clean performance metrics for kubearmor-relay. To do so, we deploy the relay on the same node as the loadgenerator by adding the toleration and nodeSelector to the relay deployment:
kubectl edit deploy kubearmor-relay -n kubearmor
Paste the config below into the pod template spec (spec.template.spec), at the same level as containers:
tolerations:
- key: color
  operator: Equal
  value: blue
  effect: NoSchedule
nodeSelector:
  nodetype: node1
Then press the Esc key and type :wq to save your changes. After a while, you will see the relay being deployed on the same node as the loadgenerator.
You can verify this with the following command:
kubectl get pods -n kubearmor -o wide