From b64e4c0eeb5221441a210428f5fb6e7de5799104 Mon Sep 17 00:00:00 2001 From: Kat Batuigas <36839689+kbatuigas@users.noreply.github.com> Date: Mon, 29 Jul 2024 10:41:51 -0400 Subject: [PATCH] Updates to RP self test (#604) Co-authored-by: Paulo Borges Co-authored-by: Jake Cahill <45230295+JakeSCahill@users.noreply.github.com> Co-authored-by: Michele Cyran Co-authored-by: Joyce Fee <102751339+Feediver1@users.noreply.github.com> Co-authored-by: Mike Boquard Co-authored-by: tris0laris <57298792+tris0laris@users.noreply.github.com> Co-authored-by: Dave Voutila Co-authored-by: Angela Simms <102690377+asimms41@users.noreply.github.com> Co-authored-by: Andrew Hsu Co-authored-by: Oren Leiman --- modules/get-started/pages/whats-new.adoc | 10 + .../cluster-diagnostics.adoc | 243 +---------- .../rpk-cluster-self-test-start.adoc | 8 +- .../rpk-cluster-self-test-status.adoc | 389 +----------------- .../partials/rpk-self-test-cloud-tests.adoc | 6 + .../partials/rpk-self-test-descriptions.adoc | 15 + .../partials/rpk-self-test-status-output.adoc | 216 ++++++++++ 7 files changed, 273 insertions(+), 614 deletions(-) create mode 100644 modules/reference/partials/rpk-self-test-cloud-tests.adoc create mode 100644 modules/reference/partials/rpk-self-test-descriptions.adoc create mode 100644 modules/reference/partials/rpk-self-test-status-output.adoc diff --git a/modules/get-started/pages/whats-new.adoc b/modules/get-started/pages/whats-new.adoc index 00ce137ec..f26286a94 100644 --- a/modules/get-started/pages/whats-new.adoc +++ b/modules/get-started/pages/whats-new.adoc @@ -40,6 +40,16 @@ Redpanda now includes `rpk` and Redpanda Console support for managing xref:manag Client throughput quotas, previously applied on a per-shard basis, now apply on a per-broker basis. Cluster configuration properties for managing client quotas are xref:upgrade:deprecated/index.adoc[deprecated], including `target_quota_byte_rate` which is disabled by default with the value `0`. 
+== Self-test enhancements + +New tests are added to the xref:manage:cluster-maintenance/cluster-diagnostics.adoc[Redpanda self-test] suite: + +* Cloud storage tests to validate xref:manage:tiered-storage.adoc[Tiered Storage] configuration. +* 16K block size disk tests to better assess block storage performance, particularly in response to I/O depth changes. +* 4K block size disk test with dsync off to assess the impact of fdatasync on the storage layer. + +See the xref:reference:rpk/rpk-cluster/rpk-cluster-self-test-status.adoc[`rpk self test`] reference for usage and output examples. + == Next steps * xref:install-beta.adoc[] diff --git a/modules/manage/pages/cluster-maintenance/cluster-diagnostics.adoc b/modules/manage/pages/cluster-maintenance/cluster-diagnostics.adoc index 7383216e1..45b76d81c 100644 --- a/modules/manage/pages/cluster-maintenance/cluster-diagnostics.adoc +++ b/modules/manage/pages/cluster-maintenance/cluster-diagnostics.adoc @@ -11,7 +11,19 @@ When anomalous behavior arises in a cluster and you're trying to figure out whet Self-test runs a set of benchmarks to determine the maximum performance of a machine's disks and network connections. For disks, it runs throughput and latency tests by performing concurrent sequential operations. For networks, it selects unique pairs of Redpanda nodes as client/server pairs, then it runs throughput tests between them. Self-test runs each benchmark for a configurable duration, and it returns IOPS, throughput, and latency metrics. -=== Self-test command examples +== Cloud storage tests + +If you use xref:manage:tiered-storage.adoc[Tiered Storage], run self-test to verify that you have configured your cloud storage accounts correctly. 
+ +Self-test performs the following tests to validate cloud storage configuration: + +include::reference:partial$rpk-self-test-cloud-tests.adoc[] + +See the xref:reference:rpk/rpk-cluster/rpk-cluster-self-test-start.adoc[`rpk cluster self-test start`] reference for cloud storage test details. + +== Self-test command examples + +=== Start self-test To begin using self-test, run the `self-test start` command. @@ -34,6 +46,8 @@ rpk cluster self-test status The `self-test start` command returns immediately, and self-test runs its benchmarks asynchronously. +=== Check self-test status + To check on the status of self-test, run the `self-test status` command. [,bash] @@ -66,231 +80,12 @@ rpk cluster self-test status --format=json If benchmarks have completed, `self-test status` returns their results. +include::reference:partial$rpk-self-test-descriptions.adoc[] + .Example status output: test results -[%collapsible] -==== -Test results are grouped by node ID. Each test returns the following: - -- **NAME**: Description of the test. -- **INFO**: Detail about the test run attached by Redpanda itself. -- **TYPE**: Either `disk` or `network` test. -- **TEST ID**: Unique identifier given to jobs of a run. All IDs in a test should match. If they don't match, then newer and/or older test results have been included erroneously. -- **TIMEOUTS**: Number of timeouts incurred during the test. -- **DURATION**: Duration of the test. -- **IOPS**: Number of operations per second. For disk, it's `seastar::dma_read` and `seastar::dma_write`. For network, it's `rpc.send()` -- **THROUGHPUT**: For disk, it's throughput rate in bytes per second. For network, it's throughput rate in bits per second in. (Note: GiB vs. Gib is the correct notation displayed by the UI.) -- **LATENCY**: 50th, 90th, etc. percentiles of operation latency, reported in microseconds. 
- -``` -$ rpk cluster self-test status -NODE ID: 1 | STATUS: IDLE -========================= -NAME 512K sequential r/w throughput disk test -INFO write run -TYPE disk -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5001ms -IOPS 1590 req/sec -THROUGHPUT 795.2MiB/sec -LATENCY P50 P90 P99 P999 MAX - 831us 5887us 11263us 24575us 507903us - -NAME 512K sequential r/w throughput disk test -INFO read run -TYPE disk -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5001ms -IOPS 4504 req/sec -THROUGHPUT 2.2GiB/sec -LATENCY P50 P90 P99 P999 MAX - 703us 1599us 4351us 6399us 10239us - -NAME 4k sequential r/w latency/iops disk test -INFO write run -TYPE disk -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5031ms -IOPS 289 req/sec -THROUGHPUT 144.7MiB/sec -LATENCY P50 P90 P99 P999 MAX - 543us 34815us 69631us 77823us 77823us - -NAME 4k sequential r/w latency/iops disk test -INFO read run -TYPE disk -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5000ms -IOPS 8275 req/sec -THROUGHPUT 4.041GiB/sec -LATENCY P50 P90 P99 P999 MAX - 191us 447us 831us 2175us 278527us - -NAME 8K Network Throughput Test -INFO Test performed against node: 0 -TYPE network -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5000ms -IOPS 61254 req/sec -THROUGHPUT 3.74Gib/sec -LATENCY P50 P90 P99 P999 MAX - 159us 207us 303us 415us 1087us - -NAME 8K Network Throughput Test -INFO Test performed against node: 2 -TYPE network -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5000ms -IOPS 54814 req/sec -THROUGHPUT 3.35Gib/sec -LATENCY P50 P90 P99 P999 MAX - 167us 255us 367us 511us 25599us - -NODE ID: 0 | STATUS: IDLE -========================= -NAME 512K sequential r/w throughput disk test -INFO write run -TYPE disk -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5002ms -IOPS 1593 req/sec -THROUGHPUT 796.8MiB/sec -LATENCY P50 P90 P99 P999 MAX - 735us 5887us 11263us 69631us 507903us - 
-NAME 512K sequential r/w throughput disk test -INFO read run -TYPE disk -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5000ms -IOPS 4372 req/sec -THROUGHPUT 2.135GiB/sec -LATENCY P50 P90 P99 P999 MAX - 735us 1599us 4351us 7423us 9215us - -NAME 4k sequential r/w latency/iops disk test -INFO write run -TYPE disk -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5026ms -IOPS 286 req/sec -THROUGHPUT 143.1MiB/sec -LATENCY P50 P90 P99 P999 MAX - 543us 34815us 69631us 77823us 77823us - -NAME 4k sequential r/w latency/iops disk test -INFO read run -TYPE disk -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5000ms -IOPS 8269 req/sec -THROUGHPUT 4.038GiB/sec -LATENCY P50 P90 P99 P999 MAX - 191us 447us 831us 2175us 278527us - -NAME 8K Network Throughput Test -INFO Test performed against node: 1 -TYPE network -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5000ms -IOPS 61612 req/sec -THROUGHPUT 3.76Gib/sec -LATENCY P50 P90 P99 P999 MAX - 159us 207us 303us 431us 1151us - -NAME 8K Network Throughput Test -INFO Test performed against node: 2 -TYPE network -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5000ms -IOPS 60306 req/sec -THROUGHPUT 3.68Gib/sec -LATENCY P50 P90 P99 P999 MAX - 159us 215us 351us 495us 11263us - -NODE ID: 2 | STATUS: IDLE -========================= -NAME 512K sequential r/w throughput disk test -INFO write run -TYPE disk -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5001ms -IOPS 1580 req/sec -THROUGHPUT 790MiB/sec -LATENCY P50 P90 P99 P999 MAX - 671us 5887us 12287us 47103us 507903us - -NAME 512K sequential r/w throughput disk test -INFO read run -TYPE disk -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5000ms -IOPS 3932 req/sec -THROUGHPUT 1.92GiB/sec -LATENCY P50 P90 P99 P999 MAX - 831us 1791us 4351us 7167us 9215us - -NAME 4k sequential r/w latency/iops disk test -INFO write run -TYPE disk -TEST ID 
5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5027ms -IOPS 280 req/sec -THROUGHPUT 140.1MiB/sec -LATENCY P50 P90 P99 P999 MAX - 575us 34815us 73727us 86015us 86015us - -NAME 4k sequential r/w latency/iops disk test -INFO read run -TYPE disk -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5000ms -IOPS 8699 req/sec -THROUGHPUT 4.248GiB/sec -LATENCY P50 P90 P99 P999 MAX - 183us 367us 831us 2175us 278527us - -NAME 8K Network Throughput Test -INFO Test performed against node: 0 -TYPE network -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5000ms -IOPS 60027 req/sec -THROUGHPUT 3.66Gib/sec -LATENCY P50 P90 P99 P999 MAX - 159us 223us 351us 511us 11775us - -NAME 8K Network Throughput Test -INFO Test performed against node: 1 -TYPE network -TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d -TIMEOUTS 0 -DURATION 5000ms -IOPS 63090 req/sec -THROUGHPUT 3.85Gib/sec -LATENCY P50 P90 P99 P999 MAX - 151us 207us 319us 463us 17407us - -``` -==== +include::reference:partial$rpk-self-test-status-output.adoc[] -NOTE: If self-test returns write results that are unexpectedly and significantly lower than read results, it may be because the Redpanda `rpk` client hardcodes the `DSync` option to `true`. When `DSync` is enabled, files are opened with the `O_DSYNC` flag set, and this represents the actual setting that Redpanda uses when it writes to disk. +=== Stop self-test To stop a running self-test, run the `self-test stop` command. diff --git a/modules/reference/pages/rpk/rpk-cluster/rpk-cluster-self-test-start.adoc b/modules/reference/pages/rpk/rpk-cluster/rpk-cluster-self-test-start.adoc index e54dc9f56..987c83424 100644 --- a/modules/reference/pages/rpk/rpk-cluster/rpk-cluster-self-test-start.adoc +++ b/modules/reference/pages/rpk/rpk-cluster/rpk-cluster-self-test-start.adoc @@ -19,10 +19,10 @@ Available tests to run: * *Cloud storage tests* ** Latency test: 1024-bit object. 
** Depending on cluster read/write permissions (xref:reference:properties/object-storage-properties.adoc#cloud_storage_enable_remote_read[`cloud_storage_enable_remote_read`], xref:reference:properties/object-storage-properties.adoc#cloud_storage_enable_remote_write[`cloud_storage_enable_remote_write`]), a series of cloud storage operations are performed: -*** Upload an object to an object storage. -*** List objects in the object storage. -*** Download an object from the object storage. -*** Delete the original object from the object storage, if it was uploaded. ++ +-- +include::reference:partial$rpk-self-test-cloud-tests.adoc[] +-- This command prompts users for confirmation (unless the flag `--no-confirm` is specified), then returns a test identifier ID, and runs the tests. diff --git a/modules/reference/pages/rpk/rpk-cluster/rpk-cluster-self-test-status.adoc b/modules/reference/pages/rpk/rpk-cluster/rpk-cluster-self-test-status.adoc index 9ba5f2015..5163d6084 100644 --- a/modules/reference/pages/rpk/rpk-cluster/rpk-cluster-self-test-status.adoc +++ b/modules/reference/pages/rpk/rpk-cluster/rpk-cluster-self-test-status.adoc @@ -17,21 +17,7 @@ Node 1 is still running net self test * No jobs running ** Returns the cached results for all brokers of the last completed test. -Test results are grouped by broker ID. Each test returns the following: - -* *Name*: Description of the test. -* *Info*: Details about the test run attached by Redpanda. -* *Type*: Either `disk`, `network`, or `cloud` test. -* *Test Id*: Unique identifier given to jobs of a run. All IDs in a test should match. If they don't match, then newer and/or older test results have been included erroneously. -* *Timeouts*: Number of timeouts incurred during the test. -* *Start time*: Time that the test started, in UTC. -* *End time*: Time that the test ended, in UTC. -* *Avg Duration*: Duration of the test. -* *IOPS*: Number of operations per second. 
For disk, it's `seastar::dma_read` and `seastar::dma_write`. For network, it's `rpc.send()`. -* *Throughput*: For disk, throughput rate is in bytes per second. For network, throughput rate is in bits per second. Note that GiB vs. Gib is the correct notation displayed by the UI. -* *Latency*: 50th, 90th, etc. percentiles of operation latency, reported in microseconds (μs). Represented as P50, P90, P99, P999, and MAX respectively. - -If xref:manage:tiered-storage.adoc[Tiered Storage] is not enabled, the cloud storage tests won't run and a warning will be displayed showing "Cloud storage is not enabled.". All results will be shown as 0. +include::reference:partial$rpk-self-test-descriptions.adoc[] == Usage @@ -69,377 +55,8 @@ Example input: rpk cluster self-test status ---- -Example output, for three-broker cluster: - -[,bash] ----- -NODE ID: 1 | STATUS: IDLE -========================= -NAME 512KB sequential r/w throughput disk test -INFO write run -TYPE disk -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 18:58:13 UTC 2024 -END TIME Thu Jul 11 18:58:43 UTC 2024 -AVG DURATION 30007ms -IOPS 140 req/sec -THROUGHPUT 70.17MiB/sec -LATENCY P50 P90 P99 P999 MAX - 24575us 47103us 188415us 425983us 507903us - -NAME 512KB sequential r/w throughput disk test -INFO read run -TYPE disk -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 18:58:43 UTC 2024 -END TIME Thu Jul 11 18:59:13 UTC 2024 -AVG DURATION 30008ms -IOPS 276 req/sec -THROUGHPUT 138.5MiB/sec -LATENCY P50 P90 P99 P999 MAX - 13823us 19455us 24575us 57343us 77823us - -NAME 4KB sequential r/w latency/iops disk test -INFO write run -TYPE disk -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 18:59:13 UTC 2024 -END TIME Thu Jul 11 18:59:43 UTC 2024 -AVG DURATION 30000ms -IOPS 6769 req/sec -THROUGHPUT 26.44MiB/sec -LATENCY P50 P90 P99 P999 MAX - 191us 255us 2303us 13311us 81919us - -NAME 4KB sequential r/w latency/iops disk test 
-INFO read run -TYPE disk -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 18:59:43 UTC 2024 -END TIME Thu Jul 11 19:00:13 UTC 2024 -AVG DURATION 30001ms -IOPS 13235 req/sec -THROUGHPUT 51.7MiB/sec -LATENCY P50 P90 P99 P999 MAX - 127us 239us 735us 1919us 63487us - -NAME 8Kb Network Throughput Test -INFO Test performed against node: 2 -TYPE network -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 19:00:13 UTC 2024 -END TIME Thu Jul 11 19:00:43 UTC 2024 -AVG DURATION 30000ms -IOPS 55370 req/sec -THROUGHPUT 3.38Gib/sec -LATENCY P50 P90 P99 P999 MAX - 167us 231us 351us 495us 7679us - -NAME Cloud Storage Test -INFO Put -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 8ms - -NAME Cloud Storage Test -INFO List -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 1ms - -NAME Cloud Storage Test -INFO Get -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 1ms - -NAME Cloud Storage Test -INFO Head -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 0ms - -NAME Cloud Storage Test -INFO Delete -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 1ms - -NAME Cloud Storage Test -INFO Plural Delete -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 47ms - -NODE ID: 2 | STATUS: IDLE -========================= -NAME 512KB sequential r/w throughput disk test 
-INFO write run -TYPE disk -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 18:58:13 UTC 2024 -END TIME Thu Jul 11 18:58:43 UTC 2024 -AVG DURATION 30006ms -IOPS 141 req/sec -THROUGHPUT 70.52MiB/sec -LATENCY P50 P90 P99 P999 MAX - 24575us 47103us 188415us 409599us 507903us - -NAME 512KB sequential r/w throughput disk test -INFO read run -TYPE disk -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 18:58:43 UTC 2024 -END TIME Thu Jul 11 18:59:13 UTC 2024 -AVG DURATION 30011ms -IOPS 279 req/sec -THROUGHPUT 139.5MiB/sec -LATENCY P50 P90 P99 P999 MAX - 13823us 19455us 24575us 57343us 81919us - -NAME 4KB sequential r/w latency/iops disk test -INFO write run -TYPE disk -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 18:59:13 UTC 2024 -END TIME Thu Jul 11 18:59:43 UTC 2024 -AVG DURATION 29999ms -IOPS 7045 req/sec -THROUGHPUT 27.52MiB/sec -LATENCY P50 P90 P99 P999 MAX - 191us 255us 2303us 13823us 81919us - -NAME 4KB sequential r/w latency/iops disk test -INFO read run -TYPE disk -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 18:59:43 UTC 2024 -END TIME Thu Jul 11 19:00:13 UTC 2024 -AVG DURATION 30000ms -IOPS 13064 req/sec -THROUGHPUT 51.03MiB/sec -LATENCY P50 P90 P99 P999 MAX - 127us 247us 767us 2175us 61439us - -NAME Cloud Storage Test -INFO Put -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 8ms - -NAME Cloud Storage Test -INFO List -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 1ms - -NAME Cloud Storage Test -INFO Get -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 1ms - -NAME Cloud Storage Test -INFO 
Head -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 0ms - -NAME Cloud Storage Test -INFO Delete -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 1ms - -NAME Cloud Storage Test -INFO Plural Delete -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 47ms - -NODE ID: 0 | STATUS: IDLE -========================= -NAME 512KB sequential r/w throughput disk test -INFO write run -TYPE disk -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 18:58:13 UTC 2024 -END TIME Thu Jul 11 18:58:43 UTC 2024 -AVG DURATION 30009ms -IOPS 140 req/sec -THROUGHPUT 70.38MiB/sec -LATENCY P50 P90 P99 P999 MAX - 24575us 47103us 180223us 360447us 507903us - -NAME 512KB sequential r/w throughput disk test -INFO read run -TYPE disk -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 18:58:43 UTC 2024 -END TIME Thu Jul 11 18:59:13 UTC 2024 -AVG DURATION 30005ms -IOPS 278 req/sec -THROUGHPUT 139.2MiB/sec -LATENCY P50 P90 P99 P999 MAX - 13823us 19455us 24575us 57343us 77823us - -NAME 4KB sequential r/w latency/iops disk test -INFO write run -TYPE disk -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 18:59:13 UTC 2024 -END TIME Thu Jul 11 18:59:43 UTC 2024 -AVG DURATION 30000ms -IOPS 6767 req/sec -THROUGHPUT 26.43MiB/sec -LATENCY P50 P90 P99 P999 MAX - 191us 255us 2303us 13823us 102399us - -NAME 4KB sequential r/w latency/iops disk test -INFO read run -TYPE disk -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 18:59:43 UTC 2024 -END TIME Thu Jul 11 19:00:13 UTC 2024 -AVG DURATION 30003ms -IOPS 13206 req/sec -THROUGHPUT 51.59MiB/sec -LATENCY P50 
P90 P99 P999 MAX - 123us 239us 735us 1855us 63487us - -NAME 8Kb Network Throughput Test -INFO Test performed against node: 1 -TYPE network -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 19:00:13 UTC 2024 -END TIME Thu Jul 11 19:00:43 UTC 2024 -AVG DURATION 30000ms -IOPS 34929 req/sec -THROUGHPUT 2.13Gib/sec -LATENCY P50 P90 P99 P999 MAX - 303us 367us 511us 671us 6399us - -NAME 8Kb Network Throughput Test -INFO Test performed against node: 2 -TYPE network -TEST ID 5632cefe-cf42-44ab-bba9-18fa1c71e3ee -TIMEOUTS 0 -START TIME Thu Jul 11 19:00:43 UTC 2024 -END TIME Thu Jul 11 19:01:13 UTC 2024 -AVG DURATION 30000ms -IOPS 86498 req/sec -THROUGHPUT 5.28Gib/sec -LATENCY P50 P90 P99 P999 MAX - 107us 151us 247us 351us 10239us - -NAME Cloud Storage Test -INFO Put -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 8ms - -NAME Cloud Storage Test -INFO List -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 1ms - -NAME Cloud Storage Test -INFO Get -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 1ms - -NAME Cloud Storage Test -INFO Head -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 0ms - -NAME Cloud Storage Test -INFO Delete -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG DURATION 1ms - -NAME Cloud Storage Test -INFO Plural Delete -TYPE cloud -TEST ID a349685a-ee49-4141-8390-c302975db3a5 -TIMEOUTS 0 -START TIME Tue Jul 16 18:06:30 UTC 2024 -END TIME Tue Jul 16 18:06:30 UTC 2024 -AVG 
DURATION 47ms - - ----- - -NOTE: If self-test returns write results that are unexpectedly and significantly lower than read results, it may be because the Redpanda `rpk` client hardcodes the `DSync` option to `true`. When `DSync` is enabled, files are opened with the `O_DSYNC` flag set, and this represents the actual setting that Redpanda uses when it writes to disk. +.Example output +include::reference:partial$rpk-self-test-status-output.adoc[] == Related topics diff --git a/modules/reference/partials/rpk-self-test-cloud-tests.adoc b/modules/reference/partials/rpk-self-test-cloud-tests.adoc new file mode 100644 index 000000000..fc95b5971 --- /dev/null +++ b/modules/reference/partials/rpk-self-test-cloud-tests.adoc @@ -0,0 +1,6 @@ +. Upload an object (a random buffer of 1024 bytes) to the cloud storage bucket/container. +. List objects in the bucket/container. +. Download the uploaded object from the bucket/container. +. Download the uploaded object's metadata from the bucket/container. +. Delete the uploaded object from the bucket/container. +. Upload and then delete multiple objects (random buffers) at once from the bucket/container. \ No newline at end of file diff --git a/modules/reference/partials/rpk-self-test-descriptions.adoc b/modules/reference/partials/rpk-self-test-descriptions.adoc new file mode 100644 index 000000000..a26449794 --- /dev/null +++ b/modules/reference/partials/rpk-self-test-descriptions.adoc @@ -0,0 +1,15 @@ +Test results are grouped by broker ID. Each test returns the following: + +* *Name*: Description of the test. +* *Info*: Details about the test run attached by Redpanda. +* *Type*: Either `disk`, `network`, or `cloud` test. +* *Test Id*: Unique identifier given to jobs of a run. All IDs in a test should match. If they don't match, then newer and/or older test results have been included erroneously. +* *Timeouts*: Number of timeouts incurred during the test. +* *Start time*: Time that the test started, in UTC. 
+* *End time*: Time that the test ended, in UTC. +* *Avg Duration*: Duration of the test. +* *IOPS*: Number of operations per second. For disk, it's `seastar::dma_read` and `seastar::dma_write`. For network, it's `rpc.send()`. +* *Throughput*: For disk, throughput rate is in bytes per second. For network, throughput rate is in bits per second. The output uses the matching notation: GiB (bytes) for disk results and Gib (bits) for network results. +* *Latency*: 50th, 90th, etc. percentiles of operation latency, reported in microseconds (μs). Represented as P50, P90, P99, P999, and MAX respectively. + +If xref:manage:tiered-storage.adoc[Tiered Storage] is not enabled, then cloud storage tests do not run, and a warning displays: "Cloud storage is not enabled." All results are shown as 0. \ No newline at end of file diff --git a/modules/reference/partials/rpk-self-test-status-output.adoc b/modules/reference/partials/rpk-self-test-status-output.adoc new file mode 100644 index 000000000..bcba22798 --- /dev/null +++ b/modules/reference/partials/rpk-self-test-status-output.adoc @@ -0,0 +1,216 @@ +[%collapsible] +==== +[,console] +---- +$ rpk cluster self-test status +NODE ID: 0 | STATUS: IDLE +========================= +NAME 512KB sequential r/w +INFO write run (iodepth: 4, dsync: true) +TYPE disk +TEST ID 21c5a3de-c75b-480c-8a3d-0cbb63228cb1 +TIMEOUTS 0 +START TIME Fri Jul 19 15:02:45 UTC 2024 +END TIME Fri Jul 19 15:03:15 UTC 2024 +AVG DURATION 30002ms +IOPS 1182 req/sec +THROUGHPUT 591.4MiB/sec +LATENCY P50 P90 P99 P999 MAX + 3199us 3839us 9727us 12799us 21503us + +NAME 512KB sequential r/w +INFO read run +TYPE disk +TEST ID 21c5a3de-c75b-480c-8a3d-0cbb63228cb1 +TIMEOUTS 0 +START TIME Fri Jul 19 15:03:15 UTC 2024 +END TIME Fri Jul 19 15:03:45 UTC 2024 +AVG DURATION 30000ms +IOPS 6652 req/sec +THROUGHPUT 3.248GiB/sec +LATENCY P50 P90 P99 P999 MAX + 607us 671us 831us 991us 2431us + +NAME 4KB sequential r/w, low io depth +INFO write run (iodepth: 1, dsync: true) +TYPE disk +TEST ID 
21c5a3de-c75b-480c-8a3d-0cbb63228cb1 +TIMEOUTS 0 +START TIME Fri Jul 19 15:03:45 UTC 2024 +END TIME Fri Jul 19 15:04:15 UTC 2024 +AVG DURATION 30000ms +IOPS 406 req/sec +THROUGHPUT 1.59MiB/sec +LATENCY P50 P90 P99 P999 MAX + 2431us 2559us 2815us 5887us 9215us + +NAME 4KB sequential r/w, low io depth +INFO read run +TYPE disk +TEST ID 21c5a3de-c75b-480c-8a3d-0cbb63228cb1 +TIMEOUTS 0 +START TIME Fri Jul 19 15:04:15 UTC 2024 +END TIME Fri Jul 19 15:04:45 UTC 2024 +AVG DURATION 30000ms +IOPS 430131 req/sec +THROUGHPUT 1.641GiB/sec +LATENCY P50 P90 P99 P999 MAX + 1us 2us 12us 28us 511us + +NAME 4KB sequential write, medium io depth +INFO write run (iodepth: 8, dsync: true) +TYPE disk +TEST ID 21c5a3de-c75b-480c-8a3d-0cbb63228cb1 +TIMEOUTS 0 +START TIME Fri Jul 19 15:04:45 UTC 2024 +END TIME Fri Jul 19 15:05:15 UTC 2024 +AVG DURATION 30013ms +IOPS 513 req/sec +THROUGHPUT 2.004MiB/sec +LATENCY P50 P90 P99 P999 MAX + 15871us 16383us 21503us 32767us 40959us + +NAME 4KB sequential write, high io depth +INFO write run (iodepth: 64, dsync: true) +TYPE disk +TEST ID 21c5a3de-c75b-480c-8a3d-0cbb63228cb1 +TIMEOUTS 0 +START TIME Fri Jul 19 15:05:15 UTC 2024 +END TIME Fri Jul 19 15:05:45 UTC 2024 +AVG DURATION 30114ms +IOPS 550 req/sec +THROUGHPUT 2.151MiB/sec +LATENCY P50 P90 P99 P999 MAX + 118783us 118783us 147455us 180223us 180223us + +NAME 4KB sequential write, very high io depth +INFO write run (iodepth: 256, dsync: true) +TYPE disk +TEST ID 21c5a3de-c75b-480c-8a3d-0cbb63228cb1 +TIMEOUTS 0 +START TIME Fri Jul 19 15:05:45 UTC 2024 +END TIME Fri Jul 19 15:06:16 UTC 2024 +AVG DURATION 30460ms +IOPS 558 req/sec +THROUGHPUT 2.183MiB/sec +LATENCY P50 P90 P99 P999 MAX + 475135us 491519us 507903us 524287us 524287us + +NAME 4KB sequential write, no dsync +INFO write run (iodepth: 64, dsync: false) +TYPE disk +TEST ID 21c5a3de-c75b-480c-8a3d-0cbb63228cb1 +TIMEOUTS 0 +START TIME Fri Jul 19 15:06:16 UTC 2024 +END TIME Fri Jul 19 15:06:46 UTC 2024 +AVG DURATION 30000ms +IOPS 424997 req/sec 
+THROUGHPUT 1.621GiB/sec +LATENCY P50 P90 P99 P999 MAX + 135us 183us 303us 543us 9727us + +NAME 16KB sequential r/w, high io depth +INFO write run (iodepth: 64, dsync: false) +TYPE disk +TEST ID 21c5a3de-c75b-480c-8a3d-0cbb63228cb1 +TIMEOUTS 0 +START TIME Fri Jul 19 15:06:46 UTC 2024 +END TIME Fri Jul 19 15:07:16 UTC 2024 +AVG DURATION 30000ms +IOPS 103047 req/sec +THROUGHPUT 1.572GiB/sec +LATENCY P50 P90 P99 P999 MAX + 511us 1087us 1343us 1471us 15871us + +NAME 16KB sequential r/w, high io depth +INFO read run +TYPE disk +TEST ID 21c5a3de-c75b-480c-8a3d-0cbb63228cb1 +TIMEOUTS 0 +START TIME Fri Jul 19 15:07:16 UTC 2024 +END TIME Fri Jul 19 15:07:46 UTC 2024 +AVG DURATION 30000ms +IOPS 193966 req/sec +THROUGHPUT 2.96GiB/sec +LATENCY P50 P90 P99 P999 MAX + 319us 383us 735us 1023us 6399us + +NAME 8K Network Throughput Test +INFO Test performed against node: 1 +TYPE network +TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d +TIMEOUTS 0 +DURATION 5000ms +IOPS 61612 req/sec +THROUGHPUT 3.76Gib/sec +LATENCY P50 P90 P99 P999 MAX + 159us 207us 303us 431us 1151us + +NAME 8K Network Throughput Test +INFO Test performed against node: 2 +TYPE network +TEST ID 5e4052f3-b828-4c0d-8fd0-b34ff0b6c35d +TIMEOUTS 0 +DURATION 5000ms +IOPS 60306 req/sec +THROUGHPUT 3.68Gib/sec +LATENCY P50 P90 P99 P999 MAX + 159us 215us 351us 495us 11263us + +NAME Cloud Storage Test +INFO Put +TYPE cloud +TEST ID a349685a-ee49-4141-8390-c302975db3a5 +TIMEOUTS 0 +START TIME Tue Jul 16 18:06:30 UTC 2024 +END TIME Tue Jul 16 18:06:30 UTC 2024 +AVG DURATION 8ms + +NAME Cloud Storage Test +INFO List +TYPE cloud +TEST ID a349685a-ee49-4141-8390-c302975db3a5 +TIMEOUTS 0 +START TIME Tue Jul 16 18:06:30 UTC 2024 +END TIME Tue Jul 16 18:06:30 UTC 2024 +AVG DURATION 1ms + +NAME Cloud Storage Test +INFO Get +TYPE cloud +TEST ID a349685a-ee49-4141-8390-c302975db3a5 +TIMEOUTS 0 +START TIME Tue Jul 16 18:06:30 UTC 2024 +END TIME Tue Jul 16 18:06:30 UTC 2024 +AVG DURATION 1ms + +NAME Cloud Storage Test +INFO Head +TYPE cloud 
+TEST ID a349685a-ee49-4141-8390-c302975db3a5 +TIMEOUTS 0 +START TIME Tue Jul 16 18:06:30 UTC 2024 +END TIME Tue Jul 16 18:06:30 UTC 2024 +AVG DURATION 0ms + +NAME Cloud Storage Test +INFO Delete +TYPE cloud +TEST ID a349685a-ee49-4141-8390-c302975db3a5 +TIMEOUTS 0 +START TIME Tue Jul 16 18:06:30 UTC 2024 +END TIME Tue Jul 16 18:06:30 UTC 2024 +AVG DURATION 1ms + +NAME Cloud Storage Test +INFO Plural Delete +TYPE cloud +TEST ID a349685a-ee49-4141-8390-c302975db3a5 +TIMEOUTS 0 +START TIME Tue Jul 16 18:06:30 UTC 2024 +END TIME Tue Jul 16 18:06:30 UTC 2024 +AVG DURATION 47ms +---- +==== + +NOTE: If self-test returns write results that are unexpectedly and significantly lower than read results, it may be because the Redpanda `rpk` client hardcodes the `DSync` option to `true`. When `DSync` is enabled, files are opened with the `O_DSYNC` flag set, and this represents the actual setting that Redpanda uses when it writes to disk. \ No newline at end of file
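Reviewer note, not part of the patch: the THROUGHPUT field described above mixes bytes per second (disk, for example `591.4MiB/sec`) and bits per second (network, for example `3.76Gib/sec`), so disk and network figures are not directly comparable. The following is a minimal sketch of normalizing both to bytes per second. It assumes the value always has the shape `<number><unit>/sec` seen in the sample output, with binary (1024-based) prefixes. The function name `throughput_to_bytes_per_sec` is hypothetical and is not part of `rpk`.

```python
import re

# Binary prefixes, assumed from the "MiB"/"GiB"/"Gib" notation in the sample output.
_PREFIX = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# Matches values such as "591.4MiB/sec", "790MiB/sec", or "3.76Gib/sec".
_PATTERN = re.compile(r"^([0-9]*\.?[0-9]+)(?:([KMGT])i)?([Bb])/sec$")


def throughput_to_bytes_per_sec(value: str) -> float:
    """Convert a self-test THROUGHPUT string to bytes per second.

    An uppercase "B" is read as bytes (disk tests) and a lowercase "b"
    as bits (network tests), following the field description.
    """
    match = _PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognized throughput value: {value!r}")
    number, prefix, unit = match.groups()
    result = float(number) * _PREFIX[prefix or ""]
    if unit == "b":
        result /= 8  # bits to bytes
    return result


if __name__ == "__main__":
    for sample in ("591.4MiB/sec", "3.248GiB/sec", "3.76Gib/sec"):
        print(sample, "->", throughput_to_bytes_per_sec(sample))
```

With this normalization, a network result of `3.76Gib/sec` comes out at roughly one eighth of a disk result with the same leading number in `GiB/sec`, which is the distinction the field description is pointing at.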