
[CORE-3081] kafka: add read_distribution histogram #18745

Merged: 8 commits, Jun 7, 2024

Conversation

@WillemKauf WillemKauf commented May 31, 2024

Adds a read_distribution histogram and probe to the kafka::server. This new internal metric tracks the delta between the timestamp of data read by Kafka fetches in fetch_ntps_in_parallel() and the current now() timestamp. The metric is aggregated across shards.

A new histogram type, log_hist_read_dist, is added. The timestamp delta is measured in minutes using 16 buckets, with bounds spanning from up to 3 minutes in the first bucket to older than 91 days in the last.
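From the bucket labels in the panel JSON below, the 16 finite bucket upper bounds appear to follow a simple doubling pattern, 2^(k+2) - 1 minutes. This reconstruction is inferred from the labels, not taken from the log_hist implementation itself:

```python
# Reconstruct the 16 finite bucket upper bounds (in minutes) implied by the
# bucket labels in the panel JSON below. The doubling pattern is an inference
# from the labels, not lifted from the actual log_hist C++ code.
bounds_minutes = [2 ** (k + 2) - 1 for k in range(16)]

print(bounds_minutes[0])   # 3      -> first bucket covers up to 3 minutes
print(bounds_minutes[-1])  # 131071 -> ~91 days; older reads land in the +Inf bucket
```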

There are no updates made to the Redpanda dashboard yet, so it is up to the user to configure a panel with this histogram if they are interested in seeing the statistics.

For Grafana visualization, I recommend:

  • Adding a new panel with vectorized_kafka_fetch_read_distribution_bucket as the selected metric.
  • In the "Options" drop-down, selecting {{le}} as the legend, with Heatmap as the format.
  • Setting the visualization type to Bar gauge.
  • Transforming the field names from minutes to something more human readable as the buckets increase in size.

The result should look like:

[screenshot: the configured bar gauge panel showing per-bucket read counts]

My configured panel JSON is:

{
  "id": 23763572047,
  "gridPos": {
    "x": 0,
    "y": 0,
    "w": 12,
    "h": 8
  },
  "type": "bargauge",
  "title": "Read Distribution",
  "targets": [
    {
      "datasource": {
        "uid": "fXEekgsSz",
        "type": "prometheus"
      },
      "refId": "A",
      "hide": false,
      "editorMode": "code",
      "expr": "vectorized_kafka_fetch_read_distribution_bucket",
      "legendFormat": "{{le}}",
      "range": true,
      "format": "heatmap"
    }
  ],
  "options": {
    "reduceOptions": {
      "values": false,
      "calcs": [
        "lastNotNull"
      ],
      "fields": ""
    },
    "orientation": "auto",
    "displayMode": "gradient",
    "showUnfilled": true,
    "minVizWidth": 0,
    "minVizHeight": 10
  },
  "fieldConfig": {
    "defaults": {
      "mappings": [],
      "thresholds": {
        "mode": "absolute",
        "steps": [
          {
            "value": null,
            "color": "green"
          },
          {
            "value": 80,
            "color": "red"
          }
        ]
      },
      "color": {
        "mode": "thresholds"
      }
    },
    "overrides": []
  },
  "datasource": {
    "uid": "fXEekgsSz",
    "type": "prometheus"
  },
  "pluginVersion": "9.2.10",
  "transformations": [
    {
      "id": "organize",
      "options": {
        "excludeByName": {
          "Time": false
        },
        "indexByName": {},
        "renameByName": {
          "3.000000": "3m",
          "7.000000": "7m",
          "15.000000": "15m",
          "31.000000": "31m",
          "63.000000": "1h",
          "127.000000": "2h",
          "255.000000": "4h",
          "511.000000": "8.5h",
          "1023.000000": "17h",
          "2047.000000": "34h",
          "4095.000000": "68h",
          "8191.000000": "6d",
          "16383.000000": "11d",
          "32767.000000": "22d",
          "65535.000000": "45d",
          "131071.000000": "91d",
          "+Inf": ">91d"
        }
      }
    }
  ]
}

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

Features

  • Adds a read distribution histogram to the internal metrics that can be visualized using Prometheus and Grafana:
    • vectorized_kafka_fetch_read_distribution_bucket
    • vectorized_kafka_fetch_read_distribution_count
    • vectorized_kafka_fetch_read_distribution_sum
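As a sketch of how these series can be queried (assuming standard Prometheus histogram conventions and the le label format shown in the panel JSON above):

```promql
# Average age (in minutes) of data read by fetches over the last 5 minutes:
sum(rate(vectorized_kafka_fetch_read_distribution_sum[5m]))
  /
sum(rate(vectorized_kafka_fetch_read_distribution_count[5m]))

# Fraction of reads older than ~1 hour (the 63-minute bucket bound):
1 - (
  sum(rate(vectorized_kafka_fetch_read_distribution_bucket{le="63.000000"}[5m]))
  /
  sum(rate(vectorized_kafka_fetch_read_distribution_count[5m]))
)
```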

@github-actions github-actions bot added area/redpanda area/wasm WASM Data Transforms labels May 31, 2024
@WillemKauf WillemKauf requested review from ballard26 and andrwng May 31, 2024 20:30
@WillemKauf WillemKauf marked this pull request as ready for review June 3, 2024 17:51
@WillemKauf WillemKauf changed the title metrics: add read_distribution histogram [CORE-3081] metrics: add read_distribution histogram Jun 3, 2024
@andrwng (Contributor) left a comment

Structurally looks pretty good, just a couple high level questions about the max timestamp.

Also it'd be great to include some tests (I think fixture tests might be able to directly access the partition probe? otherwise a single-node test in ducktape is fine)

src/v/kafka/server/handlers/fetch.cc (review comment, resolved)
src/v/cluster/partition_probe.h (review comment, resolved)
src/v/kafka/server/handlers/fetch.cc (review comment, resolved)
src/v/utils/log_hist.h (review comment, resolved)
@@ -199,6 +199,8 @@ class log_hist {
*/
seastar::metrics::histogram internal_histogram_logform() const;

seastar::metrics::histogram read_dist_histogram_logform() const;
I suspect the public and internal histograms were introduced to be generic off-the-shelf reusables that metric authors could use without worrying about things like metric cardinality blowing grafana up. I'm wondering if they're reusable enough for our purposes here, or whether this read_dist_histogram should be made generic enough to guide others who may introduce metrics of data timestamp deltas

@ballard26 any thoughts here?


log_hist is designed to be an approximation of a histogram. Take the type:

using log_hist_internal = log_hist<std::chrono::microseconds, 26, 8ul>;

There will be 26 buckets, each of which is an unsigned integer. Each bucket ends up as another metric series, so 26 in this case, or 18 in the public histogram case. For log_hist_internal the buckets are laid out like:

[0, 8us), [8us, 16us), [16us, 32us), [32us, 64us), ..., [~1min, +inf)

When a duration is recorded, the bucket whose range includes the duration is found, and the integer for that bucket is incremented.

From this we can see that the two existing reusables are great for recording latency, since there is a lot of granularity around the 0 to 1s range. However, for this metric it seems we care a bit more about broader ranges like minutes or hours. So the existing reusables won't work, as all those values would be recorded in the last bucket and the resulting histogram wouldn't be very useful.

So to that end, I think what @WillemKauf has done here is fine. It gives up the granularity we have in the latency histograms around 0-1s in exchange for having more buckets over a longer range of time.
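To make the bucket mapping described above concrete, here is a rough Python sketch of the indexing scheme (an illustration of the idea, not the actual log_hist C++ implementation):

```python
def bucket_index(value_us: int, first_bound_us: int = 8, num_buckets: int = 26) -> int:
    """Map a duration (in microseconds) to a log_hist-style bucket index.

    Buckets double in width: [0, 8us), [8us, 16us), [16us, 32us), ...
    with everything past the last finite bound landing in the final bucket.
    Defaults mirror log_hist_internal's 26 buckets with an 8us first bound.
    """
    # Values below the first bound fall into bucket 0; each doubling of the
    # bound moves the value one bucket further up.
    idx = (value_us // first_bound_us).bit_length()
    return min(idx, num_buckets - 1)

print(bucket_index(5))    # 0: falls in [0, 8us)
print(bucket_index(8))    # 1: falls in [8us, 16us)
print(bucket_index(40))   # 3: falls in [32us, 64us)
```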


We probably want to make the comments on the reusable types a bit more instructive, i.e., this is the histogram you want to use if you only care about accurately representing sub-second durations, minute/hour durations, day durations, etc.


TL;DR: the reusable types were only designed for sub-second latency durations. It makes sense to me to add more types that accurately represent other ranges of time folks may be interested in.

src/v/kafka/server/replicated_partition.cc (review comment, resolved)
@github-actions github-actions bot removed the area/wasm WASM Data Transforms label Jun 4, 2024
WillemKauf commented Jun 4, 2024

Structurally looks pretty good, just a couple high level questions about the max timestamp.

Also it'd be great to include some tests (I think fixture tests might be able to directly access the partition probe? otherwise a single-node test in ducktape is fine)

Added ducktape test test_read_distribution_metric, removed max_timestamp() related functions, and we now compare the Kafka fetch timestamp against model::timestamp::now() for a more real-time age of data reads.

@WillemKauf

/ci-repeat

@WillemKauf WillemKauf force-pushed the read_histogram branch 3 times, most recently from b91584f to e38be80 Compare June 5, 2024 17:41
@StephanDollberg (Member) left a comment

I am wondering whether we somehow want to weigh this by bytes?

Like this we don't differentiate between a 100 byte read and a 100KB read.

Maybe this makes no difference because both kinda have the same overhead when having to reach out to cloud and I guess the main thing why we are interested in this is for local retention reasons.

src/v/cluster/partition_probe.cc Outdated Show resolved Hide resolved

WillemKauf commented Jun 6, 2024

Thanks for the extensive feedback @StephanDollberg.

  • Reworked the probe into the kafka::server object
  • The metric is no longer associated with any NTP label or aggregation, but is still aggregated across all shards (by default).

@WillemKauf WillemKauf changed the title [CORE-3081] metrics: add read_distribution histogram [CORE-3081] kafka: add read_distribution histogram Jun 6, 2024
src/v/kafka/read_distribution_probe.h (review comment, resolved)
src/v/kafka/read_distribution_probe.h (review comment, resolved)
Add a new probe that will be used in the `kafka::server`.
Contains the `log_hist_read_dist` object for recording the
read distribution of data fetched, based on the timestamp delta
between the data read and the current `now()` timestamp.

Histogram metrics are aggregated across shards when the cluster config
`aggregate_metrics` is `true`.
@WillemKauf WillemKauf merged commit 18fa688 into redpanda-data:dev Jun 7, 2024
19 checks passed