Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

streamingccl,db-console: add Cross-Cluster Replication metrics page #96206

Merged
merged 2 commits into from
Feb 7, 2023

Conversation

adityamaru
Copy link
Contributor

This change adds a new option to the metric dropdown dashboard
for Cross-Cluster Replication. This page will show all the user-facing
metrics relevant to C2C replication.

Informs: #95995

Release note: None

server,db-console: add feature flag for Cross-Cluster Replication dashboard

This change adds a feature flag whose value is determined by the cluster
setting cross_cluster_replication.enabled. This feature flag is then used
when rendering the metrics dashboard dropdown options to decide whether to
display the cross-cluster replication dashboard.

This change also deletes the session variable and associated cluster setting
sql.defaults.experimental_stream_replication.enabled so as to unify all
interactions with cross-cluster replication under a single cluster setting.

Fixes: #95995

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@adityamaru adityamaru force-pushed the c2c-rep-debug-console branch 2 times, most recently from 99c3e48 to 0eb4eac Compare February 1, 2023 23:23
@adityamaru
Copy link
Contributor Author

Adding some screenshots of the above change:

  • With set cluster setting cross_cluster_replication.enabled = false;

Screenshot 2023-02-01 at 6 27 37 PM

  • With set cluster setting cross_cluster_replication.enabled = true;

Screenshot 2023-02-01 at 6 28 04 PM

Screenshot 2023-02-01 at 6 28 17 PM

Screenshot 2023-02-01 at 6 28 22 PM

@adityamaru adityamaru force-pushed the c2c-rep-debug-console branch from 0eb4eac to 5cde7e2 Compare February 1, 2023 23:35
@adityamaru adityamaru marked this pull request as ready for review February 1, 2023 23:35
@adityamaru adityamaru requested a review from a team as a code owner February 1, 2023 23:35
@adityamaru adityamaru requested review from a team February 1, 2023 23:35
@adityamaru adityamaru requested review from a team as code owners February 1, 2023 23:35
@adityamaru adityamaru requested review from herkolategan, srosenberg and cucaroach and removed request for a team, herkolategan and cucaroach February 1, 2023 23:35
@adityamaru
Copy link
Contributor Author

I've been talking to @benbardin about this diff so I expect him to lead the review, but @stevendanna / @lidorcarmel it would be nice to get your LGTM on some of the C2C cluster setting + metrics changes. Thanks!

@@ -17,8 +17,10 @@ import (

"github.com/NYTimes/gziphandler"
"github.com/cockroachdb/cmux"
"github.com/cockroachdb/cockroach/pkg/ccl/streamingccl"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh this is no bueno, i'll have to inject this value or move the cluster setting to a non-ccl location.

Copy link
Collaborator

@dhartunian dhartunian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @benbardin, @lidorcarmel, @srosenberg, and @stevendanna)


pkg/ccl/streamingccl/settings.go line 19 at r2 (raw file):

// CrossClusterReplicationEnabled enables the ability to setup and control a
// cross cluster replication stream.
var CrossClusterReplicationEnabled = settings.RegisterBoolSetting(

if you wanted to, you could also read the cluster setting in the frontend instead of also wiring up a feature flag. the cluster settings could be more easily refreshed at runtime in the client as well.

if you look in clusterSettings.selectors.ts there are examples of cluster setting redux selectors you could wire up into nodeGraphs.


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/index.tsx line 109 at r1 (raw file):

  crossClusterReplication: {
    label: "Cross-Cluster Replication",
    component: crossClusterReplication,

nit: just for consistency, calling this a *Dashboard would be nice.


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/dashboards/crossClusterReplication.tsx line 39 at r1 (raw file):

    </LineGraph>,
    <LineGraph
      title="KV Bytes"

do the byte computations need to be measured as "rates"? they're monotonic counters, right?


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/dashboards/crossClusterReplication.tsx line 48 at r1 (raw file):

          title="KV Bytes"
          downsampleMax
          aggregateMax

you might want aggregateSum here. The aggregator decides how metrics from multiple nodes are aggregated together when someone looks at a "cluster" view of the metrics. I think in the case of byte counters you want to add them. I believe sum is the default aggregator (see queryFromProps in metricDataProvider/index.tsx) so you can also omit it.


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/dashboards/crossClusterReplication.tsx line 55 at r1 (raw file):

      title="SST Bytes"
      sources={storeSources}
      tooltip={`SST bytes (compressed) sent to KV by all replication jobs`}

do customers know how to disambiguate KV bytes and SST bytes?


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/dashboards/crossClusterReplication.tsx line 62 at r1 (raw file):

          title="SST Bytes"
          downsampleMax
          aggregateMax

same as above, probably want "sum".

Copy link
Collaborator

@benbardin benbardin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing to add beyond david's comments and your own! though I'd certainly appreciate somebody who knows streamingest better reading that directory.

@adityamaru adityamaru force-pushed the c2c-rep-debug-console branch from 5cde7e2 to 487a007 Compare February 3, 2023 17:10
@adityamaru adityamaru requested a review from a team as a code owner February 3, 2023 17:10
Copy link
Contributor Author

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian, @lidorcarmel, @srosenberg, and @stevendanna)


pkg/ccl/streamingccl/settings.go line 19 at r2 (raw file):

Previously, dhartunian (David Hartunian) wrote…

if you wanted to, you could also read the cluster setting in the frontend instead of also wiring up a feature flag. the cluster settings could be more easily refreshed at runtime in the client as well.

if you look in clusterSettings.selectors.ts there are examples of cluster setting redux selectors you could wire up into nodeGraphs.

Thanks for this, it makes the change smaller! I did have to move the filter call into render instead of leaving it in the constructor to get things working. I did this because we refresh the settings in componentDidMount and so the state is only set after that method is invoked. I'm very new to redux/react so let me know if there is a better place I should move the filter so that it is not invoked on each render.


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/index.tsx line 109 at r1 (raw file):

Previously, dhartunian (David Hartunian) wrote…

nit: just for consistency, calling this a *Dashboard would be nice.

ahh yes, fixed.


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/dashboards/crossClusterReplication.tsx line 39 at r1 (raw file):

Previously, dhartunian (David Hartunian) wrote…

do the byte computations need to be measured as "rates"? they're monotonic counters, right?

They are monotonic counters. After reading Snapshot Data Received in replication.tsx I realize I should add nonNegativeRate right?


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/dashboards/crossClusterReplication.tsx line 48 at r1 (raw file):

Previously, dhartunian (David Hartunian) wrote…

you might want aggregateSum here. The aggregator decides how metrics from multiple nodes are aggregated together when someone looks at a "cluster" view of the metrics. I think in the case of byte counters you want to add them. I believe sum is the default aggregator (see queryFromProps in metricDataProvider/index.tsx) so you can also omit it.

Yup you're right, thanks. I believe I can omit both the downsample and aggregate params.


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/dashboards/crossClusterReplication.tsx line 55 at r1 (raw file):

Previously, dhartunian (David Hartunian) wrote…

do customers know how to disambiguate KV bytes and SST bytes?

Good question, and I think the answer is no. Another alternative is to refer to them as logical bytes and physical bytes. We do have precedence for these terms in https://www.cockroachlabs.com/docs/stable/ui-storage-dashboard.html, @lidorcarmel curious what you think?


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/dashboards/crossClusterReplication.tsx line 62 at r1 (raw file):

Previously, dhartunian (David Hartunian) wrote…

same as above, probably want "sum".

Done.

@adityamaru adityamaru force-pushed the c2c-rep-debug-console branch from 487a007 to 14a43b5 Compare February 3, 2023 17:16
@adityamaru adityamaru requested a review from dhartunian February 3, 2023 17:17
Copy link
Collaborator

@dhartunian dhartunian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 20 of 20 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @lidorcarmel, @srosenberg, and @stevendanna)


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/index.tsx line 176 at r4 (raw file):

  refresh = () => {
    this.props.refreshNodes();
    this.props.refreshLiveness();

consider adding a refreshSettings() prop here (via the Selectors), and calling it. This will ensure that settings changes are refreshed when the metrics page is loaded since we now depend on them. will hopefully prevent one or two support tickets where ppl don't see what they expect.

This prop will be added in the mapDispatchToProps redux helper. (MapStateToProps lets your component access data from the Redux store, and MapDispatchToProps gives your component handles to action dispatchers that can mutate the redux store)


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/index.tsx line 332 at r4 (raw file):

      : dashboardDropdownOptions.filter(
          option => option.label !== "Cross-Cluster Replication",
        );

This is fine to put here. If the list was long, or the filter was complex, it would be worth memoizing but there's plenty of other stuff slowing the code down here already to load and render metrics.

What you did is actually recommended here https://reactjs.org/blog/2018/06/07/you-probably-dont-need-derived-state.html#what-about-memoization until you hit performance constraints.


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/dashboards/crossClusterReplication.tsx line 39 at r1 (raw file):

Previously, adityamaru (Aditya Maru) wrote…

They are monotonic counters. After reading Snapshot Data Received in replication.tsx I realize I should add nonNegativeRate right?

👍


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/dashboards/crossClusterReplication.tsx line 55 at r1 (raw file):

Previously, adityamaru (Aditya Maru) wrote…

Good question, and I think the answer is no. Another alternative is to refer to them as logical bytes and physical bytes. We do have precedence for these terms in https://www.cockroachlabs.com/docs/stable/ui-storage-dashboard.html, @lidorcarmel curious what you think?

SGTM. Especially since we have documentation around the difference.

Copy link
Contributor

@lidorcarmel lidorcarmel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

streamingest stuff lgtm.

Copy link
Contributor

@lidorcarmel lidorcarmel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @dhartunian, @srosenberg, and @stevendanna)


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/dashboards/crossClusterReplication.tsx line 55 at r1 (raw file):

Previously, dhartunian (David Hartunian) wrote…

SGTM. Especially since we have documentation around the difference.

the kv bytes can be renamed to logical, I think it's accurate.

but the sst bytes is not exactly physical - it is the number of bytes in ssts that replication sends to kv, which is not really the physical bytes: physical is (I think) the bytes written to the disk. the sst bytes are different because kv may uncompress the sst (if it's small) and also, pebble may compact which will write more to the disk. I think it's ok to have the name say exactly what we do in this case.

Copy link
Contributor

@lidorcarmel lidorcarmel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adityamaru you might want to update some text in chart_catalog.go for those metric renames.

This change adds a new option to the metric dropdown dashboard
for Cross-Cluster Replication. This page will show all the user-facing
metrics relevant to C2C replication.

Informs: cockroachdb#95995

Release note: None
@adityamaru adityamaru force-pushed the c2c-rep-debug-console branch from 14a43b5 to 005b74d Compare February 6, 2023 23:59
Copy link
Contributor Author

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup nice catch, done!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian, @lidorcarmel, @srosenberg, and @stevendanna)


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/index.tsx line 176 at r4 (raw file):

Previously, dhartunian (David Hartunian) wrote…

consider adding a refreshSettings() prop here (via the Selectors), and calling it. This will ensure that settings changes are refreshed when the metrics page is loaded since we now depend on them. will hopefully prevent one or two support tickets where ppl don't see what they expect.

This prop will be added in the mapDispatchToProps redux helper. (MapStateToProps lets your component access data from the Redux store, and MapDispatchToProps gives your component handles to action dispatchers that can mutate the redux store)

We already have a refreshNodeSettings property. Following it through it seems to be driven by the refreshSettings in apiReducers.ts which looks like what we want.


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/index.tsx line 332 at r4 (raw file):

Previously, dhartunian (David Hartunian) wrote…

This is fine to put here. If the list was long, or the filter was complex, it would be worth memoizing but there's plenty of other stuff slowing the code down here already to load and render metrics.

What you did is actually recommended here https://reactjs.org/blog/2018/06/07/you-probably-dont-need-derived-state.html#what-about-memoization until you hit performance constraints.

perfect, thanks


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/dashboards/crossClusterReplication.tsx line 55 at r1 (raw file):

Previously, lidorcarmel (Lidor Carmel) wrote…

the kv bytes can be renamed to logical, I think it's accurate.

but the sst bytes is not exactly physical - it is the number of bytes in ssts that replication sends to kv, which is not really the physical bytes: physical is (I think) the bytes written to the disk. the sst bytes are different because kv may uncompress the sst (if it's small) and also, pebble may compact which will write more to the disk. I think it's ok to have the name say exactly what we do in this case.

makes sense to me, changed kv bytes to logical bytes.

@adityamaru adityamaru requested a review from dhartunian February 7, 2023 00:14
…hboard

This change adds a feature flag whose value is determined by the cluster
setting `cross_cluster_replication.enabled`. This feature flag is then used
when rendering the metrics dashboard dropdown options to decide whether to
display the cross-cluster replication dashboard.

This change also deletes the session variable and associated cluster setting
`sql.defaults.experimental_stream_replication.enabled` so as to unify all
interactions with cross-cluster replication under a single cluster setting.

Fixes: cockroachdb#95995

Release note: None
@adityamaru adityamaru force-pushed the c2c-rep-debug-console branch from 005b74d to bbc6ec4 Compare February 7, 2023 14:44
Copy link
Collaborator

@dhartunian dhartunian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @lidorcarmel, @srosenberg, and @stevendanna)


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodeGraphs/index.tsx line 176 at r4 (raw file):

Previously, adityamaru (Aditya Maru) wrote…

We already have a refreshNodeSettings property. Following it through it seems to be driven by the refreshSettings in apiReducers.ts which looks like what we want.

ah nice! I missed that. 👍

@adityamaru
Copy link
Contributor Author

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 7, 2023

Build succeeded:

@craig craig bot merged commit f640acb into cockroachdb:master Feb 7, 2023
@adityamaru adityamaru deleted the c2c-rep-debug-console branch February 7, 2023 17:29
@rafiss
Copy link
Collaborator

rafiss commented Feb 7, 2023

@adityamaru exciting change! since this is adding a user-visible page to the DB console, it needs a release note. or since this was already merged, we'd need to make a DOCS ticket so that the page gets documented at https://www.cockroachlabs.com/docs/v22.2/ui-overview

@adityamaru
Copy link
Contributor Author

it needs a release note. or since this was already merged, we'd need to make a DOCS ticket so that the page gets documented

Since we only plan on releasing C2C for private preview in 23.1 we've decided not to have public-facing documentation or release notes for any of the work around this new feature. In the future when we do decide to release it publically we'll have to ensure we document this along with many of the other changes that have gone in over the past release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

c2c: add a cross cluster replication dashboard to the DB console
6 participants