
librbd QoS settings for RBD based PVs #521

Closed
mmgaggle opened this issue Aug 2, 2019 · 19 comments · Fixed by #4089
Labels
component/rbd (Issues related to RBD), keepalive (This label can be used to disable stale bot activity in the repo)

Comments

mmgaggle (Member) commented Aug 2, 2019

Describe the feature you'd like to have

The ability to set librbd QoS settings on a PV to limit how much IO can be consumed from the Ceph cluster.

The exact limits would be informed by the storage-class configuration. Ideally we would support the following types of limits:

  1. static rbd_qos_iops_limit and rbd_qos_bps_limit per volume
  2. dynamic rbd_qos_iops_limit and rbd_qos_bps_limit per volume as a function of the PV size (e.g. 3 IOPS per GB, 100 MB/s per TB), with a configurable rbd_qos_schedule_tick_min

A PVC could specify the number of IOPS when using a storage class of the second type, and the requested capacity would be adjusted based on the ratio configured in the storage class definition.
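
For illustration, a minimal sketch of how such a QoS-aware storage class could be expressed. The QoS parameter names (qosIOPSLimit, qosBPSLimit, qosIOPSPerGB) are hypothetical placeholders, not names taken from ceph-csi; clusterID and pool are the usual ceph-csi parameters:

```sh
# Hypothetical QoS-aware StorageClass (QoS parameter names are placeholders).
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd-qos
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: <cluster-id>
  pool: rbd-pool
  # type 1: static per-volume limits
  qosIOPSLimit: "1000"
  qosBPSLimit: "104857600"      # 100 MiB/s
  # type 2: capacity-based limits
  qosIOPSPerGB: "3"
reclaimPolicy: Delete
allowVolumeExpansion: true
EOF
```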

What is the value to the end user?

Many users were frustrated by noisy-neighbor IO issues in early Ceph deployments that were catering to OpenStack environments. Folks started to implement QEMU throttling at the virtio-blk/scsi layer, and this made things much more manageable. Capacity-based IOPS further improved the situation by providing a familiar, public-cloud-like experience (vs. static per-volume limits).

We want Kubernetes and OpenShift users to have improved noisy neighbor isolation too!

How will we know we have a good solution?

  1. Configure ceph-csi to use the rbd-nbd approach.
  2. Provision a volume from a storage class configured as above.
  3. The CSI provisioner would set the limit on the RBD image.
  4. An fio test against the PV would confirm that the IOPS limits are being enforced (see the example below).

Once the resize work is finished, we'll need to ensure that new limits are applied when a volume is resized.
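
For step 4, a simple check could look like the following (the mount path and workload parameters are illustrative; assumes the PV is mounted at /mnt/pvc and the image carries a 1000 IOPS limit):

```sh
# Push far more IO than the limit allows and check where the reported IOPS plateau.
fio --name=qos-check --filename=/mnt/pvc/testfile --size=1G \
    --rw=randread --bs=4k --iodepth=16 --direct=1 --ioengine=libaio \
    --time_based --runtime=60 --group_reporting
# With a 1000 IOPS cap on the RBD image, the read IOPS reported by fio
# should level off around 1000 regardless of the queue depth.
```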

fire commented Dec 15, 2019

Is anyone working on this? What is the status of this?

dillaman commented

I don't believe anyone is working on it -- and it really doesn't make much sense since rbd-nbd isn't really production-worthy (right now), so we wouldn't want to encourage even more folks to use it.

The best longer-term solution would be to ensure cgroups v2 is being utilized on the k8s node so that generic block rate limiting controls can be applied (which would handle both krbd and rbd-nbd). I'm not sure of the realistic timeline for cgroups v2 integration in k8s (it just became the default under Fedora 31).
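
For reference, cgroup v2 expresses block throttling via the io.max controller file; a rough sketch follows (the device major:minor and the pod cgroup path are placeholders):

```sh
# Throttle the block device backing the PV (here assumed to be /dev/rbd0
# with major:minor 252:0) for a given pod cgroup. Note that read and
# write limits are set independently of each other.
echo "252:0 riops=500 wiops=500 rbps=52428800 wbps=52428800" \
  > /sys/fs/cgroup/<pod-cgroup>/io.max
```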

fire commented Dec 16, 2019

Do you know where cgroups v2 integration in k8s for limiting block rates is tracked?

dillaman commented

This [1] provides a really good overview and a theoretical timeline

[1] https://medium.com/nttlabs/cgroup-v2-596d035be4d7

mmgaggle (Member, Author) commented

The bigger problem with cgroups is that they only provide independent limits for read and write, compared with librbd/virtio, which provide the ability to express limits against the aggregate of reads and writes.

If I want to limit a given PV to 100 IOPS, I can't do that with cgroups. I can only set a write IOPS limit (e.g. 50, 30, or 10) and another distinct limit on read IOPS (e.g. 50, 70, or 90). A client can't trade a write IO for a read IO, or vice versa.
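
librbd, on the other hand, can cap the combined rate directly. A sketch using the rbd CLI (the pool/image name is a placeholder):

```sh
# Aggregate limit: reads and writes share a single 100 IOPS budget.
rbd config image set rbd-pool/pvc-0001 rbd_qos_iops_limit 100
# Per-direction limits also exist if you really want to split them.
rbd config image set rbd-pool/pvc-0001 rbd_qos_read_iops_limit 70
rbd config image set rbd-pool/pvc-0001 rbd_qos_write_iops_limit 30
```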

mmgaggle (Member, Author) commented Apr 2, 2020

Probably would need something like this KEP to do accounting -

kubernetes/enhancements#1353

Basically, if you know a cluster can provide 100k IOPS and 100 TB, then you need to add up the PVs (qty * static limit, or capacity * ratio limit) to make sure you're not oversubscribed.
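
A back-of-the-envelope sketch of that accounting, using the numbers above:

```sh
# With 100k IOPS available and a 3 IOPS-per-GB ratio, provisioning more
# than ~33,333 GB (~33 TB) of PVs means the sum of per-PV limits exceeds
# what the cluster can actually deliver.
cluster_iops=100000
ratio_iops_per_gb=3
echo "max non-oversubscribed capacity: $(( cluster_iops / ratio_iops_per_gb )) GB"
```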

nixpanic added the component/rbd label on Apr 17, 2020
stale bot commented Oct 4, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label on Oct 4, 2020
stale bot commented Oct 12, 2020

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

stale bot closed this as completed on Oct 12, 2020
Madhu-1 reopened this on Nov 4, 2020
stale bot removed the wontfix label on Nov 4, 2020
Madhu-1 added the keepalive label on Nov 4, 2020
matti commented Jan 23, 2021

Why isn't this a top priority issue? A rogue pod can destroy the Ceph cluster.

matti commented Jan 23, 2021

"The noisy neighbor problem". Without this feature Rook won't be useable in production, as you can slow down the whole cluster by e.g extracting a huge gzip file.

(rook/rook#1499 (comment))

mykaul (Contributor) commented Jan 24, 2021

> Why isn't this a top priority issue? a rogue pod can destroy the ceph cluster

@matti - do you have any code for this you'd like to share with the community? That would be most welcome and would certainly help prioritize this work.

pre commented Jan 25, 2021

> do you have any code for this you'd like to share with the community?

Are you requesting example code for demonstrating the problem or the solution?

The example code for demonstrating the problem is very simple. It's enough to unpack a huge gzip file or run dd. You just need to run them in parallel from different nodes so that these rogue clients overwhelm the Ceph cluster.
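
For example, something as simple as the following, run from pods on a few different nodes at once (the mount path is illustrative), is enough to drag down latency for everyone else:

```sh
# Sustained large writes against an RBD-backed PV, bypassing the page cache.
dd if=/dev/zero of=/mnt/pvc/garbage bs=1M count=10240 oflag=direct
```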

TL;DR: A centralized limit for IOPS (e.g. 3 IOPS per GB, 100 MB/s per TB) is needed. The limit has to be enforced at the top so that you can't work around it by running enough rogue clients in parallel.

mykaul (Contributor) commented Jan 25, 2021

> do you have any code for this you'd like to share with the community?
>
> Are you requesting example code for demonstrating the problem or the solution?

The solution. I'm well aware of the issue.

michaelgeorgeattard commented

Hi, are there any plans or updates on this topic please?

knfoo commented Sep 30, 2021

I am also very interested in this feature 👍

HaveFun83 commented

Great feature. Any news here?

matti commented Dec 8, 2022

No news; a rogue pod can still destroy the entire Ceph cluster.

m-yosefpor commented

Some container runtimes (such as cri-o) support IOPS and bandwidth limits on pods. You need to add an annotation to the pods (with a policy engine like Kyverno or custom webhooks) to ensure pods are limited. See here for more info: cri-o/cri-o#4873

matti commented Dec 8, 2022

But a fixed bandwidth limit means that performance is always limited, i.e. always bad.

This issue is about Quality of Service, where pods would be allowed to burst to the maximum while still preventing exhaustion.
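
That is roughly what librbd's own QoS options model: a sustained limit plus a burst allowance. A sketch with the rbd CLI (the image name and values are illustrative):

```sh
# Sustained cap of 1000 IOPS, with short bursts up to 5000 IOPS allowed.
rbd config image set rbd-pool/pvc-0001 rbd_qos_iops_limit 1000
rbd config image set rbd-pool/pvc-0001 rbd_qos_iops_burst 5000
```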

Madhu-1 added a commit to Madhu-1/ceph-csi that referenced this issue (pushed several times between Aug 29 and Sep 7, 2023):

Add design doc for QoS for rbd devices mapped with both krbd and rbd-nbd

closes: ceph#521

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
mergify bot closed this as completed in #4089 on Sep 7, 2023
mergify bot pushed a commit that referenced this issue Sep 7, 2023:

Add design doc for QoS for rbd devices mapped with both krbd and rbd-nbd

closes: #521

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/ceph-csi that referenced this issue Oct 11, 2023:

Add design doc for QoS for rbd devices mapped with both krbd and rbd-nbd

closes: ceph#521

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>