librbd QoS settings for RBD based PVs #521
Is anyone working on this? What is the status of this?
I don't believe anyone is working on it, and it really doesn't make much sense, since the best longer-term solution would be to ensure cgroups v2 is being utilized on the k8s node so that generic block rate limiting controls can be applied (which would handle both krbd and rbd-nbd). I'm not sure of the realistic timeline for cgroups v2 integration in k8s (it just became the default under Fedora 31).
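For reference, a minimal sketch of what that generic block rate limiting could look like with cgroups v2 on the node; the device major:minor numbers and the cgroup path are placeholders, not something ceph-csi sets up today:

```sh
# Illustrative only: throttle block IO for a pod's cgroup via cgroup v2 io.max.
# "252:0" stands in for the mapped /dev/rbd* device's MAJ:MIN (see lsblk), and
# the kubepods path is just an example of where the pod cgroup might live.
echo "252:0 riops=1000 wiops=1000 rbps=52428800 wbps=52428800" \
  > /sys/fs/cgroup/kubepods.slice/<pod-cgroup>/io.max
```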
Do you know where cgroups v2 integration in k8s for limiting block rates is tracked?
This [1] provides a really good overview and a theoretical timeline.
The bigger problem with cgroups is that they only provide independent limits for reads and writes, whereas librbd/virtio can express limits against the aggregate of reads and writes. If I want to limit a given PV to 100 IOPS, I can't do that with cgroups; I can only set a write IOPS limit (50, 30, 10, ...) and a separate, distinct read IOPS limit (50, 70, 90, ...). A client can't trade a write IO for a read IO, or vice versa.
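To illustrate the difference, a hedged example: librbd QoS can cap the aggregate of reads and writes on a single image, which cgroup v2 io.max cannot express (the pool and image names below are made up):

```sh
# Aggregate cap with librbd QoS: reads + writes together may not exceed 100 IOPS.
rbd config image set replicapool/pvc-0001 rbd_qos_iops_limit 100
# With cgroup v2 io.max you would instead have to pick a fixed split, e.g.
#   riops=50 wiops=50
# and a client could not trade unused write IOs for read IOs.
```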
Probably would need something like this KEP to do the accounting. Basically, if you know a cluster can provide 100k IOPS and 100TB, then you need to add up the PVs (quantity × static limit, or capacity × ratio limit) to make sure you're not oversubscribed; see the sketch below.
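A minimal sketch of that accounting, assuming the two limit styles discussed above (static per-PV and capacity-based ratio); the type and the numbers are illustrative, not part of any existing KEP or ceph-csi code:

```go
package main

import "fmt"

// pv captures the two limit styles: a static per-volume IOPS limit,
// or a capacity-based ratio (IOPS per GiB of provisioned capacity).
type pv struct {
	capacityGiB float64
	staticIOPS  float64 // if > 0, use this fixed limit
	iopsPerGiB  float64 // otherwise derive the limit from capacity
}

// provisionedIOPS returns the IOPS a PV is entitled to under its limit.
func (p pv) provisionedIOPS() float64 {
	if p.staticIOPS > 0 {
		return p.staticIOPS
	}
	return p.capacityGiB * p.iopsPerGiB
}

// overSubscribed reports whether the summed entitlements exceed what
// the cluster can actually deliver.
func overSubscribed(clusterIOPS float64, pvs []pv) bool {
	total := 0.0
	for _, p := range pvs {
		total += p.provisionedIOPS()
	}
	return total > clusterIOPS
}

func main() {
	pvs := []pv{
		{capacityGiB: 1024, staticIOPS: 500}, // static limit
		{capacityGiB: 2048, iopsPerGiB: 3},   // ratio limit -> 6144 IOPS
	}
	fmt.Println(overSubscribed(100000, pvs)) // false for a 100k IOPS cluster
}
```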
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation. |
Why isn't this a top-priority issue? A rogue pod can destroy the Ceph cluster.
@matti - do you have any code for this you'd like to share with the community? That would be most welcome and will certainly help prioritize this work.
Are you requesting example code for demonstrating the problem or the solution? The example code for demonstrating the problem is very simple: it's enough to unpack a huge gzip file or run any similarly IO-heavy workload. TL;DR: some centralized limit for IOPS (e.g. 3 IOPS per GB, 100 MB/s per TB) is needed, and the limit needs to be enforced at the top so that it can't be worked around by running enough rogue clients in parallel.
The solution. I'm well aware of the issue. |
Hi, are there any plans or updates on this topic please? |
I am also very interested in this feature 👍 |
Great feature. Any news here? |
No news; a rogue pod can still destroy the entire Ceph cluster.
Some container runtimes (such as CRI-O) support IOPS and bandwidth limits, but a static bandwidth limit ensures that performance is always limited, i.e. always bad. This issue is about Quality of Service, where pods would be allowed to burst to the maximum while exhaustion of the cluster is still prevented.
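For context, librbd already exposes both a steady-state limit and a burst allowance, which is the kind of QoS this issue is after; the pool and image names below are made up:

```sh
# Steady-state cap of 300 IOPS, with short bursts up to 3000 IOPS allowed.
rbd config image set replicapool/pvc-0001 rbd_qos_iops_limit 300
rbd config image set replicapool/pvc-0001 rbd_qos_iops_burst 3000
```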
Add design doc for QoS for rbd devices mapped with both krbd and rbd-nbd closes: ceph#521 Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Describe the feature you'd like to have
The ability to set librbd QoS settings on a PV to limit how much IO can be consumed from the Ceph cluster.
The exact limits would be configured through the storage-class definition. Ideally we would support three different types of limits, such as static per-volume limits and capacity-based ratio limits (e.g. IOPS per provisioned GB).
A PVC could effectively specify the number of IOPS it needs from a storage class of the capacity-based type by adjusting the capacity it requests, based on the ratio configured in the storage-class definition.
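A purely hypothetical sketch of how the two limit styles could surface in a StorageClass; none of these parameter names exist in ceph-csi, they only illustrate the idea, and in practice a class would set one style or the other:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rbd-gold
provisioner: rbd.csi.ceph.com
parameters:
  # hypothetical: a static IOPS limit applied to every PV from this class
  qosIopsLimit: "500"
  # hypothetical: a capacity-based ratio, e.g. 3 IOPS per GiB requested
  qosIopsPerGiB: "3"
```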
What is the value to the end user?
Many users were frustrated by IO noisy-neighbor issues in early Ceph deployments that were catering to OpenStack environments. Folks started to implement QEMU throttling at the virtio-blk/scsi layer, and this became much more manageable. Capacity-based IOPS further improved the situation by providing a familiar, public-cloud-like experience (vs. static per-volume limits).
We want Kubernetes and OpenShift users to have improved noisy neighbor isolation too!
How will we know we have a good solution?
Once resize work is finished, we'll need to ensure new limits are applied when a volume is re-sized.
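For example (illustrative ratio and names): with a 3 IOPS/GiB capacity-based limit, expanding a PVC from 100 GiB to 200 GiB should bump the image's limit from 300 to 600 IOPS, which the driver would re-apply after expansion:

```sh
# Re-applied after volume expansion; image name and values are made up.
rbd config image set replicapool/pvc-0002 rbd_qos_iops_limit 600
```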