storage: introduce a couple of replication reports #40625

Merged 1 commit on Sep 10, 2019

@andreimatei andreimatei commented Sep 10, 2019

This patch implements the "Insights into Constraint Conformance" RFC
(#38309). At the time of this patch, the RFC is still pending and also
out of date.
The patch introduces the following tables in the system database,
providing information about constraint conformance, replication status
and critical localities (i.e. localities that, were they to become
unavailable, would cause some ranges to lose quorum):

  CREATE TABLE replication_constraint_stats (
      zone_id INT8 NOT NULL,
      subzone_id INT8 NOT NULL,
      type STRING NOT NULL,
      config STRING NOT NULL,
      report_id INT8 NOT NULL,
      violation_start TIMESTAMP NULL,
      violating_ranges INT8 NOT NULL,
      CONSTRAINT "primary" PRIMARY KEY (zone_id ASC, subzone_id ASC, type ASC, config ASC),
      FAMILY "primary" (zone_id, subzone_id, type, config, report_id, violation_start, violating_ranges)
  );

  CREATE TABLE replication_stats (
      zone_id INT8 NOT NULL,
      subzone_id INT8 NOT NULL,
      report_id INT8 NOT NULL,
      total_ranges INT8 NOT NULL,
      unavailable_ranges INT8 NOT NULL,
      under_replicated_ranges INT8 NOT NULL,
      over_replicated_ranges INT8 NOT NULL,
      CONSTRAINT "primary" PRIMARY KEY (zone_id ASC, subzone_id ASC),
      FAMILY "primary" (zone_id, subzone_id, report_id, total_ranges, unavailable_ranges, under_replicated_ranges, over_replicated_ranges)
  );

  CREATE TABLE replication_critical_localities (
      zone_id INT8 NOT NULL,
      subzone_id INT8 NOT NULL,
      locality STRING NOT NULL,
      report_id INT8 NOT NULL,
      at_risk_ranges INT8 NOT NULL,
      CONSTRAINT "primary" PRIMARY KEY (zone_id ASC, subzone_id ASC, locality ASC),
      FAMILY "primary" (zone_id, subzone_id, locality, report_id, at_risk_ranges)
  );

The patch also adds a system.report_meta table that holds metadata for
these reports (their creation time).
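
To make the notion of a critical locality concrete, here is a minimal Python sketch, not the code from this patch. It assumes that every replica counts equally toward quorum, that all replicas are currently live, and that every prefix of a locality string (for example a region, then region plus zone) is a candidate locality. The helper names and the sample locality strings are hypothetical.

```python
from collections import Counter


def locality_prefixes(locality):
    """Expand "region=us-east,az=a" into each of its tier prefixes."""
    tiers = locality.split(",")
    return [",".join(tiers[:i]) for i in range(1, len(tiers) + 1)]


def critical_localities(replica_localities):
    """Return the localities whose loss would leave one range without quorum.

    replica_localities holds one locality string per replica of the range.
    """
    total = len(replica_localities)
    quorum = total // 2 + 1
    replicas_in = Counter()
    for loc in replica_localities:
        replicas_in.update(locality_prefixes(loc))
    # A locality is critical if the replicas outside it cannot form a quorum.
    return sorted(loc for loc, n in replicas_in.items() if total - n < quorum)


def at_risk_ranges(ranges):
    """Count, per locality, the ranges that would lose quorum if it failed."""
    counts = Counter()
    for replica_localities in ranges.values():
        counts.update(critical_localities(replica_localities))
    return dict(counts)


# Hypothetical sample data: range ID -> locality of each replica.
ranges = {
    1: ["region=us-east,az=a", "region=us-east,az=b", "region=us-west,az=a"],
    2: ["region=us-east,az=a", "region=us-west,az=a", "region=eu,az=a"],
}
print(at_risk_ranges(ranges))
```

In this sample, range 1 keeps two of its three replicas in region=us-east, so losing that region would leave it without quorum. Range 2 spreads its replicas over three regions and has no critical locality.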

The reports are generated periodically (once a minute by default,
subject to the cluster setting kv.replication_reports.interval) by a job
running on the leaseholder of range 1.
The data is produced by joining range descriptor data from meta2 with zone
config information from the gossiped SystemConfig.
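
The join described above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the implementation in this patch: each range is assumed to be already resolved to a zone ID, "under-replicated" and "over-replicated" are assumed to compare the replica count with the zone's configured replica count, and "unavailable" is assumed to mean that fewer than a quorum of replicas sit on live nodes. The RangeDescriptor class and the replication_stats function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RangeDescriptor:
    range_id: int
    zone_id: int          # assumed to be resolved from the range's key span
    replica_nodes: list   # IDs of the nodes holding a replica


def replication_stats(ranges, zone_num_replicas, live_nodes):
    """Aggregate per-zone counters shaped like the replication_stats columns."""
    stats = {}
    for r in ranges:
        row = stats.setdefault(r.zone_id, {
            "total_ranges": 0,
            "unavailable_ranges": 0,
            "under_replicated_ranges": 0,
            "over_replicated_ranges": 0,
        })
        want = zone_num_replicas[r.zone_id]   # from the zone config
        have = len(r.replica_nodes)           # from the range descriptor
        live = sum(1 for n in r.replica_nodes if n in live_nodes)
        row["total_ranges"] += 1
        if live < have // 2 + 1:              # no quorum of live replicas
            row["unavailable_ranges"] += 1
        if have < want:
            row["under_replicated_ranges"] += 1
        elif have > want:
            row["over_replicated_ranges"] += 1
    return stats


# Hypothetical sample data.
ranges = [
    RangeDescriptor(1, 52, [1, 2, 3]),
    RangeDescriptor(2, 52, [1, 2]),
    RangeDescriptor(3, 53, [1, 4, 5]),
]
stats = replication_stats(ranges, {52: 3, 53: 3}, live_nodes={1, 2, 3})
print(stats)
```

Each dictionary in the result corresponds to one (zone_id) row of the report. Subzones and the report_id column are left out to keep the sketch short.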

Release note (sql change): The following system tables containing reports about
replication status, constraint conformance and critical localities are
introduced: replication_constraint_stats, replication_stats,
replication_critical_localities.

@andreimatei andreimatei requested a review from a team as a code owner September 10, 2019 06:10
@cockroach-teamcity

This change is Reviewable

@andreimatei andreimatei requested review from a team as code owners September 10, 2019 07:24
@andreimatei andreimatei changed the title from "Constraints report wip" to "storage: introduce a couple of replication reports" Sep 10, 2019
@andreimatei andreimatei removed the request for review from a team September 10, 2019 08:42
@andreimatei

no review on the PR; developed with @darinpp

bors r+

@craig

craig bot commented Sep 10, 2019

Build failed

@andreimatei

failed on logictest timeout - #40572

bors r+

@craig

craig bot commented Sep 10, 2019

Build failed

@andreimatei

some sort of a race in a logic test. I think I've fixed it.

bors r+

craig bot pushed a commit that referenced this pull request Sep 10, 2019
40625: storage: introduce a couple of replication reports r=andreimatei a=andreimatei

Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
@craig

craig bot commented Sep 10, 2019

Build succeeded
