[CORE-3081] kafka: add read_distribution histogram #18745

Conversation
Structurally looks pretty good, just a couple high level questions about the max timestamp.
Also it'd be great to include some tests (I think fixture tests might be able to directly access the partition probe? otherwise a single-node test in ducktape is fine)
```diff
@@ -199,6 +199,8 @@ class log_hist {
      */
     seastar::metrics::histogram internal_histogram_logform() const;
+
+    seastar::metrics::histogram read_dist_histogram_logform() const;
```
I suspect the public and internal histograms were introduced to be generic off-the-shelf reusables that metric authors could use without worrying about things like metric cardinality blowing grafana up. I'm wondering if they're reusable enough for our purposes here, or whether this read_dist_histogram should be made generic enough to guide others who may introduce metrics of data timestamp deltas
@ballard26 any thoughts here?
`log_hist` is designed to be an approximation of a histogram. Take the type:

using log_hist_internal = log_hist<std::chrono::microseconds, 26, 8ul>;

There will be 26 buckets, each of which is an unsigned integer. Each bucket ends up as another metric series, so 26 series in this case, or 18 in the public histogram case. For `log_hist_internal` the buckets are laid out like:

[0, 8us), [8us, 16us), [16us, 32us), [32us, 64us), ..., [~1min, +inf)

When a duration is recorded, the bucket whose range includes the duration is found, and the integer for that bucket is incremented.

From this we can see that the two existing reusables are great for recording latency, since there is a lot of granularity in the 0 to 1s range. However, for this metric it seems we care a bit more about broader ranges like minutes or hours, so the existing reusables won't work: all those values would land in the last bucket and the resulting histogram wouldn't be very useful.

To that end, I think what @WillemKauf has done here is fine. It gives up the granularity we have in the latency histograms around 0-1s in exchange for having more buckets over a longer range of time.
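To make the bucketing described above concrete, here is a minimal illustrative sketch, not Redpanda's actual `log_hist` implementation: each recorded duration is mapped to a bucket whose upper bound doubles from bucket to bucket, and that bucket's counter is incremented.

```cpp
#include <array>
#include <chrono>
#include <cstdint>

// Illustrative only: a simplified log2-bucketed histogram in the spirit of
// log_hist<std::chrono::microseconds, 26, 8ul>. Bucket 0 covers
// [0, first_bucket_upper_bound); every following bucket doubles the upper
// bound; the final bucket absorbs everything larger.
template<typename duration_t, int num_buckets, uint64_t first_bucket_upper_bound>
struct simple_log_hist {
    std::array<uint64_t, num_buckets> counts{};

    void record(duration_t d) {
        const auto v = static_cast<uint64_t>(d.count());
        int idx = 0;
        uint64_t upper = first_bucket_upper_bound;
        // Walk the doubling bucket bounds until the value fits, or until we
        // reach the last (unbounded) bucket.
        while (idx < num_buckets - 1 && v >= upper) {
            upper *= 2;
            ++idx;
        }
        ++counts[idx];
    }
};

// Same shape as the internal latency histogram discussed above:
// 26 buckets, first bucket [0, 8us).
using demo_log_hist_internal
  = simple_log_hist<std::chrono::microseconds, 26, 8ul>;
```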
We probably want to make the comments on the reusable types a bit more instructive. I.e., this is the histogram you want to use if you only care about accurately representing sub-second durations, this one for minute/hour durations, this one for day durations, etc.
TL;DR: the reusable types were only designed for sub-second latency durations. It makes sense to me to add more that accurately represent other ranges of time folks may be interested in.
Added ducktape test
/ci-repeat
I am wondering whether we somehow want to weight this by bytes?
As it is, we don't differentiate between a 100 byte read and a 100KB read.
Maybe this makes no difference, since both have roughly the same overhead when reaching out to cloud storage, and I guess the main reason we are interested in this is local retention.
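To make the trade-off concrete, a small hedged sketch (purely illustrative, not code from this PR): the PR's histogram is count-weighted, so each read adds 1 to a bucket, while a byte-weighted variant would add the size of the read instead, making a 100KB read of old data count 1000x more than a 100 byte one.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative only; not the PR's code. Two ways a single read-distribution
// bucket could account for a fetch that lands in it:
struct read_dist_bucket_demo {
    uint64_t reads = 0; // count-weighted: +1 per read (what the PR records)
    uint64_t bytes = 0; // byte-weighted: +bytes_read per read (the question above)

    void record(size_t bytes_read) {
        reads += 1;
        bytes += bytes_read;
    }
};
```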
For ease of accessing `is_duration_v` from elsewhere in the codebase.
Thanks for the extensive feedback @StephanDollberg.
Add a new probe that will be used in the `kafka::server`. It contains the `log_hist_read_dist` object for recording the read distribution of data fetched, based on the timestamp delta between the data read and the current `now()` timestamp. The histogram metric is aggregated across shards when the cluster config `aggregate_metrics` is `true`.
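A hedged sketch of how such a probe might record that delta; the class and member names are hypothetical, the histogram is the `simple_log_hist` sketch from the earlier comment, and the 16-bucket / 4-minute parameters are assumptions derived from the PR description rather than the PR's exact code.

```cpp
#include <chrono>

// Illustrative sketch only; names are hypothetical, not the PR's identifiers.
// Reuses the simple_log_hist template sketched earlier in this thread.
class read_dist_probe_sketch {
public:
    // Called from the fetch path with the (max) timestamp of the data that
    // was just read. Clamp to zero so future timestamps / clock skew do not
    // produce a negative delta.
    void record_read(std::chrono::system_clock::time_point data_timestamp) {
        const auto now = std::chrono::system_clock::now();
        const auto delta = now > data_timestamp
                             ? now - data_timestamp
                             : std::chrono::system_clock::duration{0};
        _hist.record(std::chrono::duration_cast<std::chrono::minutes>(delta));
    }

private:
    // 16 buckets over minutes; the 4-minute first bucket bound is an assumed
    // value chosen so the last bucket starts near 91 days (4 * 2^15 minutes).
    simple_log_hist<std::chrono::minutes, 16, 4ul> _hist;
};
```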
Adds a `read_distribution` histogram and probe to the `kafka::server`. This is a new internal metric which tracks the timestamp delta between data read by Kafka fetches in `fetch_ntps_in_parallel()` and the current `now()` timestamp. This metric is aggregated across shards.

A new histogram type `log_hist_read_dist` is added. The timestamp delta is measured in minutes using 16 buckets in the histogram, such that the bounds span from up to 3 minutes in the first bucket to older than 91 days in the last bucket (see the bucket-bound sketch below).

There are no updates made to the Redpanda dashboard yet, so it is up to the user to configure a panel with this histogram if they are interested in seeing the statistics.
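As a sanity check on those bounds, here is a small illustrative program (an assumption-laden sketch, not the PR's code) that prints the bounds of a 16-bucket, base-2 histogram over minutes. The 4-minute first bucket bound is an assumption chosen so that the last bucket's lower bound, 4 × 2^15 = 131072 minutes, lands at roughly the 91 days quoted above; the exact bound used in the PR is not confirmed here.

```cpp
#include <chrono>
#include <cstdio>

// Illustrative only: print the bounds of a 16-bucket, base-2 histogram over
// minutes, assuming a first bucket upper bound of 4 minutes. The last
// bucket's lower bound is 4 * 2^15 = 131072 minutes, i.e. roughly 91 days.
int main() {
    constexpr int num_buckets = 16;
    std::chrono::minutes bound{4};
    for (int i = 0; i < num_buckets - 1; ++i) {
        std::printf(
          "bucket %2d upper bound: %lld min (~%.2f days)\n",
          i,
          static_cast<long long>(bound.count()),
          bound.count() / (60.0 * 24.0));
        bound *= 2;
    }
    std::printf(
      "bucket %2d: [%lld min, +inf), i.e. ~%.1f days and older\n",
      num_buckets - 1,
      static_cast<long long>(bound.count()),
      bound.count() / (60.0 * 24.0));
}
```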
For Grafana visualization, I recommend:
- `vectorized_kafka_fetch_read_distribution_bucket` as the selected metric.
- `{{le}}` as the legend, with `Heatmap` as the format.
- `Bar gauge` as the panel visualization.

The result should look like:

My configured panel JSON is:

Backports Required
Release Notes
Features