Resource Constraints allow the rook components to be in specific Kubernetes Quality of Service (QoS) classes. For this the components started by rook need to have resource requests and/or limits set depending on which class the component(s) should be in.
Ceph has recommendations for CPU and memory for each component. The recommendations can be found here: http://docs.ceph.com/docs/master/start/hardware-recommendations/
If limits are set too low, this is especially a danger on the memory side. When a Pod runs out of memory because of the limit it is OOM killed, there is a potential of data loss. The CPU limits would merely limit the "throughput" of the component when reaching it's limit.
The resource constraints are defined in the rook Cluster, Filesystem and RGW CRDs.
The default is to not set resource requirements, this translates to the qosClasss: BestEffort
(qosClasss
will be later on explained in Automatic Algorithm).
The user is able to enable and disable the automatic resource algorithm as he wants.
This algorithm has a global "governance" class for the Quality of Service (QoS) class to be aimed for.
The key to choose the QoS class is named qosClasss
and in the resources
specification.
The QoS classes are:
BestEffort
- Norequests
andlimits
are set (for testing).Burstable
- Onlyrequests
requirements are set.Guaranteed
-requests
andlimits
are set to be equal.
Additionally to allow the user to simply tune up/down the values without needing to set them a key named multiplier
in the resources
specification. This value is used as a multiplier for the calculated values.
The OSDs are a special case that need to be covered by the algorithm. The algorithm needs to take the count of stores (devices and supplied directories) in account to calculate the correct amount of CPU and especially memory usage.
User defined values always overrule automatic calculated values by rook. The precedence of values is:
- User defined values
- (For OSD) per node
- Global values
Example: If you specify a global value for memory for OSDs but set a memory value for a specific node, every OSD except the specific OSD on the node, will get the global value set.
A Kubernetes resource requirement object looks like this:
requests:
cpu: “2”
memory: “1Gi”
limits:
cpu: “3”
memory: “2Gi”
The key in the CRDs to set resource requirements is named resources
.
The following components will allow a resource requirement to be set for:
api
agent
mgr
mon
osd
The mds
and rgw
components are configured through the CRD that creates them.
The mds
are created through the Filesystem
CRD and the rgw
through the ObjectStore
CRD.
There will be a resources
section added to their respective specification to allow user defined requests/limits.
To allow per node/OSD configuration of resource constraints, a key is added to the storage.nodes
item.
It is named resources
and contain a resource requirement object (see above).
The rook-ceph-agent
may be utilized in some way to detect how many stores are used.
The requests/limits configured are as an example and not to be used in production.
apiVersion: v1
kind: Namespace
metadata:
name: rook-ceph
---
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
name: rook-ceph
namespace: rook-ceph
spec:
dataDirHostPath: /var/lib/rook
hostNetwork: false
mon:
count: 3
allowMultiplePerNode: false
placement:
resources:
api:
requests:
cpu: "500m"
memory: "512Mi"
limits:
cpu: "500m"
memory: "512Mi"
mgr:
requests:
cpu: "500m"
memory: "512Mi"
limits:
cpu: "500m"
memory: "512Mi"
mon:
requests:
cpu: "500m"
memory: "512Mi"
limits:
cpu: "500m"
memory: "512Mi"
osd:
requests:
cpu: "500m"
memory: "512Mi"
limits:
cpu: "500m"
memory: "512Mi"
storage:
useAllNodes: true
useAllDevices: false
deviceFilter:
metadataDevice:
location:
storeConfig:
storeType: bluestore
nodes:
- name: "172.17.4.101"
directories:
- path: "/rook/storage-dir"
# resources for the OSD Pod
resources:
requests:
memory: "512Mi"
- name: "172.17.4.201"
devices:
- name: "sdb"
- name: "sdc"
storeConfig:
storeType: bluestore
# resources for the OSD Pod
resources:
requests:
memory: "512Mi"
cpu: "1"
limits:
cpu: "2"