cluster: expose cloud storage usage across the entire cluster #9305
Conversation
This commit adds a new `cloud_log_size` method to `cluster::partition`. It returns the sum of the sizes of all *tracked* log segments that have been uploaded to cloud storage. A segment is tracked if it is present in the manifest and has not been slated for removal.
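The bookkeeping described above can be sketched as follows. The types and names here are hypothetical stand-ins, not the actual `cluster::partition` or archival manifest types:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical segment metadata; the real manifest types differ.
struct segment_meta {
    uint64_t size_bytes;
    bool slated_for_removal; // pending deletion from the bucket
};

// A segment is "tracked" if it is present in the manifest and has not
// been slated for removal; cloud_log_size sums only tracked segments.
inline uint64_t cloud_log_size(const std::vector<segment_meta>& manifest) {
    uint64_t total = 0;
    for (const auto& s : manifest) {
        if (!s.slated_for_removal) {
            total += s.size_bytes;
        }
    }
    return total;
}
```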
Nice job +1
LGTM
A new controller RPC is introduced in this commit: `cloud_storage_usage`. The request takes a list of partitions broken down by shard, and the response contains the total number of bytes used by the cloud storage logs of the specified partitions, along with any partitions that could not be found. This RPC is a building block for a cluster-wide map-reduce operation that computes the total size of all cloud logs.
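The request/response shapes might look roughly like the sketch below. All field and type names here are assumptions for illustration, not the actual controller RPC definitions:

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Assumed request shape: partitions to query, grouped by shard.
struct cloud_storage_usage_request {
    std::map<uint32_t, std::vector<int64_t>> partitions_by_shard;
};

// Assumed response shape: total bytes found, plus the partitions
// this node could not locate.
struct cloud_storage_usage_response {
    uint64_t total_size_bytes{0};
    std::vector<int64_t> missing_partitions;
};

// Toy handler: looks up each requested partition in a local
// partition-id -> cloud log size table.
inline cloud_storage_usage_response handle_usage_request(
  const cloud_storage_usage_request& req,
  const std::map<int64_t, uint64_t>& local_sizes) {
    cloud_storage_usage_response resp;
    for (const auto& [shard, partitions] : req.partitions_by_shard) {
        for (int64_t p : partitions) {
            auto it = local_sizes.find(p);
            if (it == local_sizes.end()) {
                resp.missing_partitions.push_back(p);
            } else {
                resp.total_size_bytes += it->second;
            }
        }
    }
    return resp;
}
```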
This commit introduces a utility class that walks the topic table and generates batches of partitions and their current replicas. This operation only makes sense if the topic table is stable throughout, so an exception is thrown if the topic table mutates between batches.
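The stability check can be modeled with a revision counter that bumps on every topic table mutation. This is a simplified sketch with assumed names, not the actual utility class:

```cpp
#include <algorithm>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Toy stand-in for iterating the topic table: a flat list of partition
// ids plus a revision counter that bumps on every mutation.
class partition_batcher {
public:
    partition_batcher(std::vector<int64_t> partitions, uint64_t revision)
      : _partitions(std::move(partitions)), _expected_revision(revision) {}

    // Returns the next batch of up to `n` partitions. Throws if the
    // table mutated (revision changed) since the walk started.
    std::vector<int64_t> next_batch(size_t n, uint64_t current_revision) {
        if (current_revision != _expected_revision) {
            throw std::runtime_error("topic table mutated between batches");
        }
        size_t end = std::min(_pos + n, _partitions.size());
        std::vector<int64_t> batch(
          _partitions.begin() + _pos, _partitions.begin() + end);
        _pos = end;
        return batch;
    }

    bool done() const { return _pos == _partitions.size(); }

private:
    std::vector<int64_t> _partitions;
    size_t _pos{0};
    uint64_t _expected_revision;
};
```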
This commit introduces a utility class that performs a map-reduce operation across the cluster to determine the sum of the cloud log sizes of all partitions.
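The reduce step is essentially a fold over the per-node responses. The sketch below shows only that fold, with assumed names; the real code would also dispatch the RPCs and handle retries:

```cpp
#include <cstdint>
#include <vector>

// Assumed per-node result shape: total bytes reported by one node,
// plus the partitions that node could not find.
struct usage_result {
    uint64_t total_size_bytes;
    std::vector<int64_t> missing_partitions;
};

// Fold the per-node results into a cluster-wide total.
inline usage_result reduce_usage(const std::vector<usage_result>& results) {
    usage_result acc{0, {}};
    for (const auto& r : results) {
        acc.total_size_bytes += r.total_size_bytes;
        acc.missing_partitions.insert(
          acc.missing_partitions.end(),
          r.missing_partitions.begin(),
          r.missing_partitions.end());
    }
    return acc;
}
```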
This commit introduces a new debug route: `/v1/debug/cloud_storage_usage` which returns the total number of bytes used by the cloud log for all partitions in the cluster. This route is only to be used for testing purposes.
Sweet, LGTM
/backport v23.1.x
Failed to run cherry-pick command.
looks great
This PR introduces infrastructure that exposes the total cloud storage usage across
the cluster. The intention is to provide this as input for the billing service.
The central component here is the `cloud_storage_size_reducer`. It performs a
map-reduce operation over the cluster by iterating over the topic table in batches
and, for each batch, sending `cloud_storage_usage` RPC requests to each node in the
cluster. Each request contains the partitions being queried, grouped by shard.
The semantics of the usage returned by `cloud_storage_size_reducer::reduce` are as
follows: the sum of all segment sizes above the start offset in any node-local
partition manifest.
"Above the start offset" is relevant because the returned usage can run ahead of the
actual size in the bucket. Retention in cloud storage is a two-step process: the start
offset is advanced first, and then segments below the start offset are removed. The
reason for this approach is to avoid over-reporting if the delete requests fail.
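In other words, a segment contributes to the reported usage only while its base offset is at or above the start offset, regardless of whether its delete request has completed. A minimal sketch with hypothetical types:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical manifest entry; real manifest types differ.
struct segment {
    int64_t base_offset;
    uint64_t size_bytes;
};

// Segments below the start offset are excluded: retention has already
// logically removed them, even if the delete requests are still pending.
inline uint64_t usage_above_start_offset(
  const std::vector<segment>& manifest, int64_t start_offset) {
    uint64_t total = 0;
    for (const auto& s : manifest) {
        if (s.base_offset >= start_offset) {
            total += s.size_bytes;
        }
    }
    return total;
}
```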
Also note that the metadata stored in the cloud alongside the actual segment files is
not included in the usage reporting. The amount of metadata in the cloud is very small
compared to the actual user data (~1MiB of metadata per 3000 segments, which
correspond to 375GiB of data at the default cloud segment size), and the ratio will
become even smaller when the manifest encoding changes for v23.2. Excluding it greatly
simplifies the implementation.
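A quick back-of-the-envelope check of the quoted ratio, assuming a 128MiB default cloud segment size (implied by 375GiB / 3000 segments):

```cpp
#include <cstdint>

// Arithmetic check of the metadata-to-data ratio quoted above.
constexpr uint64_t mib = 1024ull * 1024;
constexpr uint64_t gib = 1024ull * mib;

// 3000 segments at an assumed 128 MiB default cloud segment size.
constexpr uint64_t user_data = 3000ull * 128 * mib;
static_assert(user_data == 375ull * gib, "3000 segments ~ 375 GiB");

// ~1 MiB of manifest metadata describes all of that data:
// a ratio of 1 : 384000, i.e. well under 0.001%.
constexpr uint64_t metadata = 1 * mib;
static_assert(metadata * 384000 == user_data, "ratio is 1:384000");
```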
TODO: Extend tests to include leadership changes.
Backports Required
Release Notes