spanconfig,kv: merge adjacent ranges with identical configs #79700

Merged
merged 5 commits into from
Apr 29, 2022


irfansharif
Contributor

@irfansharif irfansharif commented Apr 8, 2022

spanconfig,kv: merge adjacent ranges with identical configs

Fixes #72389.
Fixes #66063 (gated behind a cluster setting).

This should drastically reduce the total number of ranges in the system,
especially when running with a large number of tables and/or tenants. To
understand what the new set of split points is, consider the following
test snippet:

exec-sql tenant=11
CREATE DATABASE db;
CREATE TABLE db.t0();
CREATE TABLE db.t1();
CREATE TABLE db.t2();
CREATE TABLE db.t3();
CREATE TABLE db.t4();
CREATE TABLE db.t5();
CREATE TABLE db.t6();
CREATE TABLE db.t7();
CREATE TABLE db.t8();
CREATE TABLE db.t9();
ALTER TABLE db.t5 CONFIGURE ZONE using num_replicas = 42;
----

# If installing a unique zone config for a table in the middle, we
# should observe three splits (before, the table itself, and after).

diff offset=48
----
--- gossiped system config span (legacy)
+++ span config infrastructure (current)
...
 /Tenant/10                                 database system (tenant)
 /Tenant/11                                 database system (tenant)
+/Tenant/11/Table/106                       range default
+/Tenant/11/Table/111                       num_replicas=42
+/Tenant/11/Table/112                       range default

This PR introduces two cluster settings to selectively opt into this
optimization: spanconfig.{tenant,host}_coalesce_adjacent.enabled
(defaulting to true and false respectively). We also don't coalesce
system table ranges on the host tenant. We had a few implementation
choices here:

(a) Down in KV, augment the spanconfig.Store to coalesce the in-memory
state when updating entries. On every update, we'd check if the span
we're writing to has the same config as the preceding and/or
succeeding one, and if so, write out a larger span instead. We
previously prototyped a form of this in #68491.

Pros:
- reduced memory footprint of spanconfig.Store
- read path (computing splits) stays fast -- just have to look up
  the right span from the backing interval tree
Cons:
- uses more persisted storage than necessary
- difficult to disable the optimization dynamically (though still
  possible -- we'd effectively have to restart the KVSubscriber and
  populate in-memory span config state using a store that
  does/doesn't coalesce configs)
- difficult to implement; our original prototype did not have #73150
  which is important for reducing reconciliation round trips
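
To make the write-time coalescing in (a) concrete, here is a minimal sketch in Go. It is not the spanconfig.Store implementation: `entry`, `coalescingStore`, and `set` are hypothetical names, keys and configs are plain strings, and the sketch assumes a new entry never overlaps an existing one.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a hypothetical stand-in for a span config entry: a half-open
// key span [start, end) and an opaque config string.
type entry struct {
	start, end string
	config     string
}

// coalescingStore keeps non-overlapping entries sorted by start key.
type coalescingStore struct {
	entries []entry
}

// set writes e, merging it with the preceding and/or succeeding entry when
// they are adjacent and carry an identical config. For brevity this sketch
// assumes e does not overlap any existing entry.
func (s *coalescingStore) set(e entry) {
	// Index of the first existing entry that starts at or after e.start.
	i := sort.Search(len(s.entries), func(j int) bool {
		return s.entries[j].start >= e.start
	})
	lo, hi := i, i // existing entries in [lo, hi) are replaced by e
	if i > 0 && s.entries[i-1].end == e.start && s.entries[i-1].config == e.config {
		e.start = s.entries[i-1].start
		lo = i - 1
	}
	if i < len(s.entries) && s.entries[i].start == e.end && s.entries[i].config == e.config {
		e.end = s.entries[i].end
		hi = i + 1
	}
	merged := append([]entry{}, s.entries[:lo]...)
	merged = append(merged, e)
	merged = append(merged, s.entries[hi:]...)
	s.entries = merged
}

func main() {
	var s coalescingStore
	s.set(entry{"a", "b", "default"})
	s.set(entry{"c", "d", "default"})
	s.set(entry{"b", "c", "default"}) // bridges the gap, so all three merge
	s.set(entry{"d", "e", "num_replicas=42"})
	fmt.Println(s.entries)
}
```

Because the merged span replaces the originals, the uncoalesced boundaries are lost once written. That is one way to see why toggling the optimization dynamically is hard with this approach.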

(b) Same as (a) but coalesce configs up in the spanconfig.Store
maintained in the reconciler itself.

Pros:
- reduced storage use (both persisted and in-memory)
- read path (computing splits) stays fast -- just have to look up
  the right span from the backing interval tree
Cons:
- very difficult to disable the optimization dynamically (each
  tenant process would need to be involved)
- most difficult to implement

(c) Same as (a) but through another API on the spanconfig.Store
interface that accepts only a single update at a time and does not
generate deltas (not something we need down in KV). Removes the
implementation complexity.

(d) Keep the contents of system.span_configurations and the in-memory
state of spanconfig.Stores as it is today, uncoalesced. When
determining split points, iterate through adjacent configs within
the provided key bounds and see whether we could ignore certain
split keys.

Pros:
- easiest to implement
- easy to disable the optimization dynamically, e.g. through a
  cluster setting
Cons:
- uses more storage (persisted and in-memory) than necessary
- read path (computing splits) is more expensive if iterating
  through adjacent configs
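
The read-time approach in (d) can be sketched as follows. This is a simplified illustration rather than the actual spanconfig.Store code: `entry`, `splitKeys`, and `exampleEntries` are hypothetical, table IDs stand in for keys, and the sketch returns every split point at once instead of the next one within some key bounds. The example data assumes tables t0 through t9 receive IDs 106 through 115, which is consistent with the diff output in the test snippet.

```go
package main

import "fmt"

// entry is a hypothetical, simplified span config entry: table IDs stand in
// for keys, so a table with ID n owns the half-open span [n, n+1).
type entry struct {
	start, end int
	config     string
}

// splitKeys returns the start keys at which a range boundary is needed.
// entries must be sorted and non-overlapping. The first entry's start key is
// treated as the lower bound of the scan and is never reported. With coalesce
// set, a boundary between two adjacent entries with identical configs is
// skipped; without it, every entry boundary is a split point.
func splitKeys(entries []entry, coalesce bool) []int {
	var splits []int
	for i := 1; i < len(entries); i++ {
		prev, cur := entries[i-1], entries[i]
		if coalesce && prev.end == cur.start && prev.config == cur.config {
			continue
		}
		splits = append(splits, cur.start)
	}
	return splits
}

// exampleEntries mirrors the test snippet: a tenant's system span followed by
// ten tables, with a unique zone config on the one in the middle (ID 111).
func exampleEntries() []entry {
	entries := []entry{{1, 106, "database system (tenant)"}}
	for id := 106; id <= 115; id++ {
		cfg := "range default"
		if id == 111 {
			cfg = "num_replicas=42"
		}
		entries = append(entries, entry{id, id + 1, cfg})
	}
	return entries
}

func main() {
	entries := exampleEntries()
	fmt.Println(splitKeys(entries, true))  // coalescing enabled
	fmt.Println(splitKeys(entries, false)) // coalescing disabled
}
```

The stored entries are untouched, so the `coalesce` flag can be flipped at any time. The cost is the linear walk over adjacent entries on every read.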

This PR implements option (d). To gauge how slow (d) is in practice,
the benchmark below scans over varying numbers of entries (10k, 100k,
1m):

    $ dev bench pkg/spanconfig/spanconfigstore -f=BenchmarkStoreComputeSplitKey -v
    BenchmarkStoreComputeSplitKey
    BenchmarkStoreComputeSplitKey/num-entries=10000
    BenchmarkStoreComputeSplitKey/num-entries=10000-10               1166842 ns/op
    BenchmarkStoreComputeSplitKey/num-entries=100000
    BenchmarkStoreComputeSplitKey/num-entries=100000-10             12273375 ns/op
    BenchmarkStoreComputeSplitKey/num-entries=1000000
    BenchmarkStoreComputeSplitKey/num-entries=1000000-10           140591766 ns/op
    PASS

It's feasible that in the future we re-work this in favor of (c).


spanconfig/store: use templatized btree impl instead

    $ dev bench pkg/spanconfig/spanconfigstore -f=BenchmarkStoreComputeSplitKey -v
    BenchmarkStoreComputeSplitKey
    BenchmarkStoreComputeSplitKey/num-entries=10000
    BenchmarkStoreComputeSplitKey/num-entries=10000-10                431093 ns/op
    BenchmarkStoreComputeSplitKey/num-entries=100000
    BenchmarkStoreComputeSplitKey/num-entries=100000-10              4308200 ns/op
    BenchmarkStoreComputeSplitKey/num-entries=1000000
    BenchmarkStoreComputeSplitKey/num-entries=1000000-10            43827373 ns/op
    PASS

    $ benchstat old.txt new.txt # from previous commit
    name                                         old time/op  new time/op  delta
    StoreComputeSplitKey/num-entries=10000-10    1.17ms ± 0%  0.43ms ± 0%   ~     (p=1.000 n=1+1)
    StoreComputeSplitKey/num-entries=100000-10   12.4ms ± 0%   4.3ms ± 0%   ~     (p=1.000 n=1+1)
    StoreComputeSplitKey/num-entries=1000000-10   136ms ± 0%    44ms ± 0%   ~     (p=1.000 n=1+1)

spanconfig/store: intern span configs

- Avoid the expensive proto.Equal() when computing split keys;
- Reduce memory overhead of the data structure.

    $ dev bench pkg/spanconfig/spanconfigstore -f=BenchmarkStoreComputeSplitKey -v
    BenchmarkStoreComputeSplitKey
    BenchmarkStoreComputeSplitKey/num-entries=10000
    BenchmarkStoreComputeSplitKey/num-entries=10000-10                 90323 ns/op
    BenchmarkStoreComputeSplitKey/num-entries=100000
    BenchmarkStoreComputeSplitKey/num-entries=100000-10               915936 ns/op
    BenchmarkStoreComputeSplitKey/num-entries=1000000
    BenchmarkStoreComputeSplitKey/num-entries=1000000-10             9575781 ns/op

    $ benchstat old.txt new.txt # from previous commit
    name                                         old time/op  new time/op  delta
    StoreComputeSplitKey/num-entries=10000-10     431µs ± 0%    90µs ± 0%   ~     (p=1.000 n=1+1)
    StoreComputeSplitKey/num-entries=100000-10   4.31ms ± 0%  0.92ms ± 0%   ~     (p=1.000 n=1+1)
    StoreComputeSplitKey/num-entries=1000000-10  43.8ms ± 0%   9.6ms ± 0%   ~     (p=1.000 n=1+1)
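
As a rough illustration of what interning buys, the sketch below hands out one small integer ID per distinct config, so comparing two entries' configs becomes an integer comparison and each entry only needs to store the ID. All names here (`config`, `internID`, `interner`, `canonical`) are hypothetical, and keying the table on a formatted string is an assumption made for brevity, not a description of the actual implementation.

```go
package main

import "fmt"

// config is a hypothetical stand-in for a span config proto.
type config struct {
	numReplicas int
	gcTTLSecs   int
}

// internID is a small handle: two handles from the same interner are equal
// exactly when the configs they were issued for are equal.
type internID int

// interner stores each distinct config once.
type interner struct {
	ids     map[string]internID
	configs []config
}

func newInterner() *interner {
	return &interner{ids: map[string]internID{}}
}

// canonical returns a deterministic key for c. Real code would need some
// canonical encoding of the proto; the format used here is an assumption.
func canonical(c config) string {
	return fmt.Sprintf("%d/%d", c.numReplicas, c.gcTTLSecs)
}

// intern returns the existing ID for c, or assigns the next free one.
func (in *interner) intern(c config) internID {
	k := canonical(c)
	if id, ok := in.ids[k]; ok {
		return id
	}
	id := internID(len(in.configs))
	in.configs = append(in.configs, c)
	in.ids[k] = id
	return id
}

// get returns the config an ID was issued for.
func (in *interner) get(id internID) config {
	return in.configs[id]
}

func main() {
	in := newInterner()
	a := in.intern(config{numReplicas: 3, gcTTLSecs: 90000})
	b := in.intern(config{numReplicas: 42, gcTTLSecs: 90000})
	c := in.intern(config{numReplicas: 3, gcTTLSecs: 90000})
	// Equality of configs is now a comparison of small integers.
	fmt.Println(a == c, a == b, len(in.configs))
}
```

The expensive comparison (here, building the canonical key) is paid once per write. The split-key scan then only compares IDs.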

Release note: None
Release justification: high benefit change to existing functionality
(affecting only multi-tenant clusters).

@irfansharif irfansharif force-pushed the 220408.coalesce-spanconfigs branch 7 times, most recently from 8d56409 to 0b02b96 Compare April 13, 2022 21:18
@irfansharif irfansharif marked this pull request as ready for review April 13, 2022 21:19
@irfansharif irfansharif requested review from a team as code owners April 13, 2022 21:19
@irfansharif
Contributor Author

This should be ready for review. There are some random out-of-package tests that need to be updated since they expect splits post-table creation (they were added after we introduced tenant range splits), and I also want to write a small benchmark for the read path. Like discussed offline, plan for this PR is to let it bake on master for a few days before backporting to 22.1, aiming for the .1 release. I also think this PR makes #78299 redundant (LMK if you disagree) -- so I'm just going to close that as well.

@irfansharif irfansharif force-pushed the 220408.coalesce-spanconfigs branch 2 times, most recently from 6209b5f to dcce1db Compare April 14, 2022 05:43
@irfansharif irfansharif requested review from a team and removed request for a team April 14, 2022 05:43
@irfansharif irfansharif force-pushed the 220408.coalesce-spanconfigs branch 3 times, most recently from bda1c2c to 62ad12e Compare April 14, 2022 15:58
@irfansharif
Contributor Author

> There are some random out-of-package tests that need to be updated since they expect splits post-table creation (they were added after we introduced tenant range splits), and I also want to write a small benchmark for the read path.

Done and done.

@BramGruneir
Member

Would it be possible to backport this to 22.1?

@irfansharif
Contributor Author

Yup.

> plan for this PR is to let it bake on master for a few days before backporting to 22.1, aiming for the .1 release.

For single-tenant clusters it won't be enabled by default; there'll be an opt-in cluster setting.

@BramGruneir
Member

Awesome! Thanks.

@irfansharif
Contributor Author

@ajwerner, @nvanbenschoten and @arulajmani: Bump for reviews. Like discussed elsewhere, this PR should get baking time to build confidence. We're also aiming for this to be part of 22.1.1 when rolling out to multi-tenant clusters.

@craig
Contributor

craig bot commented Apr 29, 2022

Build succeeded:

@craig craig bot merged commit 294e548 into cockroachdb:master Apr 29, 2022
@irfansharif irfansharif deleted the 220408.coalesce-spanconfigs branch April 29, 2022 22:47
@theodore-hyman
Contributor

theodore-hyman commented May 2, 2022

@irfansharif Thank you for your time and attention on this matter. I am chiming in just to confirm the timeline here as a quick sanity check.

Again, I know you just said that this would be backported to v22.1.1 and you just said:

> I'll open the backport now but not merge to the release until some point mid next week

However just re-confirming, apologies for the redundancy.

My understanding is that this fix, in the form of a new hidden config option, will be released as part of v22.1.1 that is scheduled for public release June 6, 2022. Please confirm or let me know if I am misunderstanding. Thank you

@irfansharif
Copy link
Contributor Author

The backport (#80811) was merged, so it'll make it to 22.1.1 whenever that goes out. That said, I don't know which customer you're referring to (DM?) and whether this would help. This PR wasn't intended as a fix for anything, so let's discuss elsewhere what we're expecting this to help with.

zachlite pushed a commit to zachlite/cockroach that referenced this pull request Feb 7, 2023
zachlite pushed a commit to zachlite/cockroach that referenced this pull request Feb 7, 2023
This commit provides a re-implementation of fetching the MVCC stats
and range count for a given span. The previous implementation relied on
the assumption that 1 range can contain at most 1 table/index. Since
cockroachdbGH-79700, this assumption is no longer true.

This re-implementation has three components. The first is spanstats.Accessor,
which provides an interface for accessing span stats from KV.

The second is spanstatsaccessor.LocalAccessor, which provides an implementation
of the interface for callers that are co-located on a KV node.

The third is an implementation by kvtenantccl.Connector, which provides an
implementation of the interface for non co-located callers, like SQL pods.
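
The three components described above could be shaped roughly as follows. This is only a sketch of the pattern (one interface, a local implementation, a forwarding implementation). The method signature, the `SpanStats` fields, and the map-backed storage are all hypothetical and are not the actual spanstats.Accessor API.

```go
package main

import (
	"errors"
	"fmt"
)

// SpanStats is a hypothetical result type for this sketch.
type SpanStats struct {
	RangeCount int
	LiveBytes  int64
}

// Accessor is a hypothetical interface for fetching stats for a span.
type Accessor interface {
	SpanStats(start, end string) (SpanStats, error)
}

// localAccessor serves callers that are co-located with KV. In this sketch
// "KV" is just a map keyed by the span's start and end keys.
type localAccessor struct {
	stats map[[2]string]SpanStats
}

func (l localAccessor) SpanStats(start, end string) (SpanStats, error) {
	s, ok := l.stats[[2]string{start, end}]
	if !ok {
		return SpanStats{}, errors.New("no stats for span")
	}
	return s, nil
}

// connector serves callers that are not co-located, such as SQL pods. The
// direct method call on remote stands in for an RPC to a KV node.
type connector struct {
	remote Accessor
}

func (c connector) SpanStats(start, end string) (SpanStats, error) {
	return c.remote.SpanStats(start, end)
}

// rangeCount is written against the interface, so it behaves the same for
// co-located and non co-located callers.
func rangeCount(a Accessor, start, end string) (int, error) {
	s, err := a.SpanStats(start, end)
	if err != nil {
		return 0, err
	}
	return s.RangeCount, nil
}

func main() {
	local := localAccessor{stats: map[[2]string]SpanStats{
		{"a", "c"}: {RangeCount: 1, LiveBytes: 2048},
	}}
	remote := connector{remote: local}
	fmt.Println(rangeCount(local, "a", "c"))
	fmt.Println(rangeCount(remote, "a", "c"))
}
```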

The interaction between these components is illustrated below:

 System Tenant
+----------------------------------------------------+
|                                                    |
|                                                    |
|                                    KV Node fan-out |
|                                     +----------+   |
|                                     |          |   |
|     +-------------------------------v----+     |   |
|     |                                    |     |   |
|     |  serverpb.InternalSpanStatsServer  +-----+   |
|     |                                    |         |
|     +----------------+-------------------+         |
|                      |                             |
|         roachpb.InternalSpanStatsResponse          |
|                      |                             |
|                      |                             |
|                      v                             |
|    +----------------------------------+            |
|    |                                  |            |
|    |  spanstatsaccessor.LocalAccessor |            |
|    |                                  |            |
|    +-----------------+----------------+            |
|                      |                             |
|                      +---------------------+       |
|         roachpb.InternalSpanStatsResponse  |       |
|                      |                     |       |
|                      v                     v       |
|         +-------------------------+  +-----------+ |
|         |                         |  |           | |
|         |  serverpb.StatusServer  |  | SQLServer | |
|         |                         |  |           | |
|         +------------+------------+  +-----------+ |
|                      |                             |
|                      |                             |
+----------------------+-----------------------------+
                       |
                       |  serverpb.SpanStatsResponse
                       |
                       |
                       |
                       |
                       |
 Secondary Tenant      |
+----------------------+------------------------------+
|                      |                              |
|         +------------v-------------+                |
|         |                          |                |
|         |   kvtenantccl.Connector  |                |
|         |                          |                |
|         +------------+-------------+                |
|                      |                              |
|                      |                              |
|                      +--------------------+         |
|          roachpb.InternalSpanStatsResponse|         |
|                      |                    |         |
|                      |                    |         |
|    +-----------------v-------+     +------v------+  |
|    |                         |     |             |  |
|    |  serverpb.StatusServer  |     |  SQLServer  |  |
|    |                         |     |             |  |
|    +-------------------------+     +-------------+  |
|                                                     |
|                                                     |
|                                                     |
+-----------------------------------------------------+

Resolves cockroachdb#84105
Part of: https://cockroachlabs.atlassian.net/browse/CRDB-22711
Release note (backward-incompatible change): The SpanStatsRequest message
field 'node_id' has changed from type 'string' to type 'int32'.
zachlite pushed a commit to zachlite/cockroach that referenced this pull request Feb 8, 2023
This commit provides a re-implementation of fetching the MVCC stats
and range count for a given span. The previous implementation relied on
the assumption that 1 range can contain at most 1 table/index. Since
cockroachdbGH-79700, this assumption is no longer true.

This re-implementation allows secondary tenants to access span stats
via the serverpb.TenantStatusServer interface.

If a roachpb.SpanStatsRequest has a node_id value of 0, instead of a specific
node id, the response will yield cluster-wide values. To achieve this,
the server does a fan-out to nodes that are known to have replicas for the
span requested.
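
A minimal sketch of the node_id semantics described above: 0 fans out to every node holding a replica for the span, while any other value targets a single node. The types and the `spanBytes` helper are hypothetical. Summing per-node byte counts is an assumption made for illustration; the commit message does not say how per-node responses are combined.

```go
package main

import "fmt"

// request is a hypothetical, simplified stand-in for a span stats request.
type request struct {
	nodeID     int32 // 0 means "all nodes with replicas for the span"
	start, end string
}

// node models one KV node: the spans it holds replicas for, mapped to the
// approximate bytes stored on that node.
type node struct {
	id    int32
	spans map[string]int64
}

// spanBytes answers req against nodes. With nodeID 0 it visits every node
// that has a replica for the span and sums the results (an assumption of
// this sketch); otherwise it consults only the requested node.
func spanBytes(nodes []node, req request) (int64, error) {
	span := req.start + "-" + req.end
	var total int64
	served := false
	for _, n := range nodes {
		if req.nodeID != 0 && n.id != req.nodeID {
			continue // a specific node was requested and this is not it
		}
		b, ok := n.spans[span]
		if !ok {
			continue // this node has no replica for the span
		}
		total += b
		served = true
	}
	if !served {
		return 0, fmt.Errorf("no replica for span %s on requested node(s)", span)
	}
	return total, nil
}

func main() {
	nodes := []node{
		{id: 1, spans: map[string]int64{"a-c": 100}},
		{id: 2, spans: map[string]int64{"a-c": 120}},
		{id: 3, spans: map[string]int64{}},
	}
	fmt.Println(spanBytes(nodes, request{nodeID: 0, start: "a", end: "c"}))
	fmt.Println(spanBytes(nodes, request{nodeID: 2, start: "a", end: "c"}))
}
```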

The interaction between tenants is illustrated below:

 System Tenant
+----------------------------------------------------+
|                                                    |
|                                                    |
|                                    KV Node fan-out |
|                                     +----------+   |
|                                     |          |   |
|     +-------------------------------v----+     |   |
|     |                                    |     |   |
|     |      server.systemStatusServer     +-----+   |
|     |                                    |         |
|     +----------------+-------------------+         |
|                      |                             |
|                      |                             |
|                      |     via TenantStatusServer  |
|                      +--------------+              |
|                      |              |              |
|                      |              |              |
|                      |              v              |
|                      |        +-----------+        |
|                      |        |           |        |
|                      |        | SQLServer |        |
|                      |        |           |        |
|                      |        +-----------+        |
|                      |                             |
+----------------------v-----------------------------+
                       |
                       |
                       |
                       |   roachpb.SpanStatsResponse
                       |
                       |
 Secondary Tenant      |
+----------------------+------------------------------+
|                      |                              |
|         +------------v-------------+                |
|         |            |             |                |
|         |   kvtenantccl.Connector  |                |
|         |                          |                |
|         +------------+-------------+                |
|                      |                              |
|                      |    via TenantStatusServer    |
|                      +--------------------+         |
|                      |                    |         |
|                      |                    |         |
|                      |                    |         |
|    +-----------------v-------+     +------v------+  |
|    |                         |     |             |  |
|    |    server.statusServer  |     |  SQLServer  |  |
|    |                         |     |             |  |
|    +-------------------------+     +-------------+  |
|                                                     |
|                                                     |
|                                                     |
+-----------------------------------------------------+

Resolves cockroachdb#84105
Part of: https://cockroachlabs.atlassian.net/browse/CRDB-22711
Release note (backward-incompatible change): The SpanStatsRequest message
field 'node_id' has changed from type 'string' to type 'int32'.

refactor away from accessor interface

remove print
zachlite pushed a commit to zachlite/cockroach that referenced this pull request Feb 9, 2023
zachlite pushed a commit to zachlite/cockroach that referenced this pull request Feb 13, 2023
zachlite pushed a commit to zachlite/cockroach that referenced this pull request Feb 14, 2023
This commit provides a re-implementation of fetching the MVCC stats
and range count for a given span. The previous implementation relied on
the assumption that 1 range can contain at most 1 table/index. Since
cockroachdbGH-79700, this assumption is no longer true.

This re-implementation allows secondary tenants to access span stats
via the serverpb.TenantStatusServer interface.

If a roachpb.SpanStatsRequest has a node_id value of "0", instead of a specific
node id, the response will yield cluster-wide values. To achieve this,
the server does a fan-out to nodes that are known to have replicas for the
span requested.

The interaction between tenants is illustrated below:

 System Tenant
+----------------------------------------------------+
|                                                    |
|                                                    |
|                                    KV Node fan-out |
|                                     +----------+   |
|                                     |          |   |
|     +-------------------------------v----+     |   |
|     |                                    |     |   |
|     |      server.systemStatusServer     +-----+   |
|     |                                    |         |
|     +----------------+-------------------+         |
|                      |                             |
|                      |                             |
|                      |     via TenantStatusServer  |
|                      +--------------+              |
|                      |              |              |
|                      |              |              |
|                      |              v              |
|                      |        +-----------+        |
|                      |        |           |        |
|                      |        | SQLServer |        |
|                      |        |           |        |
|                      |        +-----------+        |
|                      |                             |
+----------------------v-----------------------------+
                       |
                       |
                       |
                       |   roachpb.SpanStatsResponse
                       |
                       |
 Secondary Tenant      |
+----------------------+------------------------------+
|                      |                              |
|         +------------v-------------+                |
|         |            |             |                |
|         |   kvtenantccl.Connector  |                |
|         |                          |                |
|         +------------+-------------+                |
|                      |                              |
|                      |    via TenantStatusServer    |
|                      +--------------------+         |
|                      |                    |         |
|                      |                    |         |
|                      |                    |         |
|    +-----------------v-------+     +------v------+  |
|    |                         |     |             |  |
|    |    server.statusServer  |     |  SQLServer  |  |
|    |                         |     |             |  |
|    +-------------------------+     +-------------+  |
|                                                     |
|                                                     |
|                                                     |
+-----------------------------------------------------+

Resolves cockroachdb#84105
Part of: https://cockroachlabs.atlassian.net/browse/CRDB-22711
Release note: None
craig bot pushed a commit that referenced this pull request Feb 14, 2023
96223: kvserver, server: new implementation of SpanStats suitable for use with coalesced ranges r=zachlite a=zachlite

This commit provides a re-implementation of fetching the MVCC stats
and range count for a given span. The previous implementation relied on
the assumption that 1 range can contain at most 1 table/index. Since
GH-79700, this assumption is no longer true.

This re-implementation allows secondary tenants to access span stats
via the serverpb.TenantStatusServer interface.

If a roachpb.SpanStatsRequest sets node_id to 0 rather than a specific
node ID, the response contains cluster-wide values. To compute them,
the server fans out to the nodes that are known to have replicas for the
requested span.
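
The fan-out behavior can be sketched roughly as follows. This is a simplified illustration, not the server code: `spanStats`, `nodeFetcher`, and `fetchSpanStats` are hypothetical stand-ins for the roachpb request/response types, and summing the per-node results is an assumption of the sketch (the commit message does not say how results are combined).

```go
package main

import (
	"fmt"
	"sync"
)

// spanStats is a hypothetical stand-in for one node's span stats.
type spanStats struct {
	RangeCount int
	LiveBytes  int64
}

// nodeFetcher is a hypothetical callback returning one node's stats.
type nodeFetcher func(nodeID int) (spanStats, error)

// fetchSpanStats asks a single node when nodeID is non-zero. When nodeID
// is 0 it fans out to every node in replicaNodes concurrently and sums
// the results (summation is an assumption of this sketch).
func fetchSpanStats(nodeID int, replicaNodes []int, fetch nodeFetcher) (spanStats, error) {
	if nodeID != 0 {
		return fetch(nodeID)
	}
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		total    spanStats
		firstErr error
	)
	for _, id := range replicaNodes {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			s, err := fetch(id)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = err // remember the first failure only
				}
				return
			}
			total.RangeCount += s.RangeCount
			total.LiveBytes += s.LiveBytes
		}(id)
	}
	wg.Wait()
	if firstErr != nil {
		return spanStats{}, firstErr
	}
	return total, nil
}

func main() {
	perNode := map[int]spanStats{
		1: {RangeCount: 2, LiveBytes: 100},
		2: {RangeCount: 1, LiveBytes: 50},
		3: {RangeCount: 3, LiveBytes: 25},
	}
	fetch := func(id int) (spanStats, error) { return perNode[id], nil }

	all, _ := fetchSpanStats(0, []int{1, 2, 3}, fetch) // cluster-wide
	one, _ := fetchSpanStats(2, nil, fetch)            // node 2 only
	fmt.Println(all, one)
}
```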

The interaction between tenants is illustrated below:

```
 System Tenant
┌────────────────────────────────────────────────────┐
│                                                    │
│                                                    │
│                                    KV Node fan-out │
│                                     ┌──────────┐   │
│                                     │          │   │
│     ┌───────────────────────────────▼────┐     │   │
│     │                                    │     │   │
│     │      server.systemStatusServer     ├─────┘   │
│     │                                    │         │
│     └────────────────┬───────────────────┘         │
│                      │                             │
│                      │                             │
│                      │     via TenantStatusServer  │
│                      ├──────────────┐              │
│                      │              │              │
│                      │              │              │
│                      │              ▼              │
│                      │        ┌───────────┐        │
│                      │        │           │        │
│                      │        │ SQLServer │        │
│                      │        │           │        │
│                      │        └───────────┘        │
│                      │                             │
└──────────────────────▼─────────────────────────────┘
                       │
                       │
                       │
                       │   roachpb.SpanStatsResponse
                       │
                       │
 Secondary Tenant      │
┌──────────────────────┼──────────────────────────────┐
│                      │                              │
│         ┌────────────▼─────────────┐                │
│         │            │             │                │
│         │   kvtenantccl.Connector  │                │
│         │                          │                │
│         └────────────┬─────────────┘                │
│                      │                              │
│                      │    via TenantStatusServer    │
│                      ├────────────────────┐         │
│                      │                    │         │
│                      │                    │         │
│                      │                    │         │
│    ┌─────────────────▼───────┐     ┌──────▼──────┐  │
│    │                         │     │             │  │
│    │    server.statusServer  │     │  SQLServer  │  │
│    │                         │     │             │  │
│    └─────────────────────────┘     └─────────────┘  │
│                                                     │
│                                                     │
│                                                     │
└─────────────────────────────────────────────────────┘
```


Part of: https://cockroachlabs.atlassian.net/browse/CRDB-22711
Release note: None

97082: ui: add latency info to stmt pages r=maryliag a=maryliag

Part Of: #72954

<img width="1373" alt="Screenshot 2023-02-13 at 5 53 57 PM" src="https://user-images.githubusercontent.com/1017486/218592825-922eed84-2559-4902-b291-852958a59ed6.png">

Add p50, p90, p99, max and min latency to the
Statement table on the SQL Activity page.

Release note (ui change): Add columns p50, p90, p99, max and min latency for Statement table on SQL Activity page.

Co-authored-by: Zach Lite <zach@cockroachlabs.com>
Co-authored-by: maryliag <marylia@cockroachlabs.com>
zachlite added a commit to zachlite/cockroach that referenced this pull request Mar 21, 2023
This is a stop-gap commit that enables the DataDistribution endpoint
to handle the parts of the key space belonging to secondary tenants
without error.

Although the endpoint no longer returns an error, the result for secondary
tenants is still not correct. The DataDistribution endpoint was written before
cockroachdb#79700, and therefore doesn't know that multiple tables can exist within a range.

Additionally, the results for the system tenant will be incorrect soon
because cockroachdb#81008 is in progress.

Fixes: cockroachdb#97993
Release note: None
craig bot pushed a commit that referenced this pull request Mar 30, 2023
98689: workload: jitter the teardown of connections to prevent thundering herd r=sean- a=sean-

workload: jitter the teardown of connections to prevent thundering herd

This change upgrades workload's use of pgx from v4 to v5 to allow jittering the teardown of connections. It sets a max connection age of 5 minutes and jitters the teardown by 30 seconds. Upgrading to pgx v5 also adds non-blocking pgxpool connection acquisition.

workload: add flags to manage the age and lifecycle of connection pool

Add flags to all workload types to specify:

* the max connection age: `--max-conn-lifetime duration`
* the max connection age jitter: `--max-conn-lifetime-jitter duration`
* the max connection idle time: `--max-conn-idle-time duration`
* the connection health check interval: `--conn-healthcheck-period duration`
* the min number of connections in the pool: `--min-conns int`
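
As a rough illustration of how a max connection age and its jitter combine, the sketch below computes a per-connection lifetime using only the standard library. `jitteredLifetime` is a hypothetical helper, not part of workload or pgx, and it assumes the jitter is added on top of the base lifetime.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jitteredLifetime returns base plus frac*jitter, where frac is expected
// to lie in [0, 1). Spreading lifetimes this way keeps connections that
// were opened together from all being torn down at the same instant.
func jitteredLifetime(base, jitter time.Duration, frac float64) time.Duration {
	return base + time.Duration(frac*float64(jitter))
}

func main() {
	base := 5 * time.Minute    // max connection age
	jitter := 30 * time.Second // max connection age jitter

	// Each connection draws its own random fraction, so the printed
	// lifetimes differ from run to run.
	for i := 0; i < 3; i++ {
		fmt.Println(jitteredLifetime(base, jitter, rand.Float64()))
	}
}
```

Every connection then expires somewhere between the base lifetime and the base lifetime plus the jitter, rather than all at the same moment.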

workload: add support for remaining pgx query modes

Add support for `pgx.QueryExecModeCacheDescribe` and `pgx.QueryExecModeDescribeExec`.  Previously, only three of the five query modes were available.

workload: fix race condition when recording histogram data

Release note (cli change): workload jitters teardown of connections to prevent thundering herd impacting P99 latency results.

Release note (cli change): workload utility now has flags to tune the connection pool used for testing.  See `--conn-healthcheck-period`, `--min-conns`, and the `--max-conn-*` flags for details.

Release note (cli change): workload now supports every [PostgreSQL query mode](https://github.com/jackc/pgx/blob/fa5fbed497bc75acee05c1667a8760ce0d634cba/conn.go#L167-L182) available via the underlying pgx driver.

99142: server: fix `DataDistribution` server error when creating a tenant r=zachlite a=zachlite

This is a stop-gap commit that enables the DataDistribution endpoint to handle the parts of the key space belonging to secondary tenants without error.

Although the endpoint no longer returns an error, the result for secondary tenants is still not correct. The DataDistribution endpoint was written before #79700, and therefore doesn't know that multiple tables can exist within a range.

Additionally, the results for the system tenant will be incorrect soon because #81008 is in progress.
Improvements are tracked by #97942

Fixes: #97993
Release note: None

99494: changefeedccl: Do not require rangefeed when running initial scan only. r=miretskiy a=miretskiy

Fixes #99470

Release note: None

Co-authored-by: Sean Chittenden <sean@chittenden.org>
Co-authored-by: zachlite <zachlite@gmail.com>
Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
blathers-crl bot pushed a commit that referenced this pull request Mar 30, 2023
This is a stop-gap commit that enables the DataDistribution endpoint
to handle the parts of the key space belonging to secondary tenants
without error.

Although the endpoint no longer returns an error, the result for secondary
tenants is still not correct. The DataDistribution endpoint was written before
#79700, and therefore doesn't know that multiple tables can exist within a range.

Additionally, the results for the system tenant will be incorrect soon
because #81008 is in progress.

Fixes: #97993
Release note: None