Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add mixed version logic tests for role id migrations #92342

Closed
andyyang890 opened this issue Nov 22, 2022 · 0 comments
Closed

sql: add mixed version logic tests for role id migrations #92342

andyyang890 opened this issue Nov 22, 2022 · 0 comments
Assignees
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@andyyang890
Copy link
Collaborator

andyyang890 commented Nov 22, 2022

Once the mixed version testing work in #91881 is merged, mixed version logic tests should be added for all of the role id migrations.

Jira issue: CRDB-21712

Epic CRDB-19134

@andyyang890 andyyang890 added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) labels Nov 22, 2022
craig bot pushed a commit that referenced this issue Feb 2, 2023
96282: logictest: fix user cmd to respect nodeidx r=rafiss a=andyyang890

logictest: change logicTest to store per-node clients for users

This patch changes the `clients` field within `logicTest` from type
`map[string]*gosql.DB` to `map[string]map[int]*gosql.DB` in order to
store per-node clients for each user. This allows the `user` cmd in
logic tests to fetch a client for the specified node instead of the
previously used node.

Release note: None

----

logictest: move paragraph in comment that got separated

Release note: None

----

logictest: fix user cmd to respect nodeidx for cockroach-go testserver

This patch fixes the client creation for logic tests that use the
cockroach-go testserver to use the specified nodeidx when connecting.

Release note: None

----

Informs #92342

96345: pgwire,server: clarify to SQL clients when they select the wrong tenant r=rafiss a=knz

Fixes #92525.
Epic: CRDB-14537

Prior to this patch:
```
$ ./cockroach sql -d cluster:wo
ERROR: server closed the connection.
Is this a CockroachDB node?
unexpected EOF
```

After this patch:
```
$ ./cockroach sql -d cluster:woo
ERROR: service unavailable for target tenant (woo)
SQLSTATE: 08000
HINT: Double check your "-ccluster=" connection parameter or your "cluster:" database name prefix.
```

Release note: None

Co-authored-by: Andy Yang <yang@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
craig bot pushed a commit that referenced this issue Feb 3, 2023
96388: logictest: Skip if using cockroach-go/testserver r=renatolabs a=itsbilal

Skip of flaky test.

Informs: #96387

Epic: none

Release note: None

96452: roachtest: add aws weekly script r=rickystewart a=healthy-pod

This change adds the scripts to be used by the aws weekly roachtest teamcity job.

Release note: None
Epic: none

96469: logictest: refactor client connection creation logic r=rafiss a=andyyang890

This patch abstracts the logic for creating client connections and
persisting them into a separate function. As a side benefit, it also
fixes the problem of clients created in `execQuery` not being properly
tracked in the `clients` field of `logicTest`.

Informs #92342

Release note: None

96500: authors: add BabuSrithar to authors r=healthy-pod a=BabuSrithar

Release note: None
Epic: None

Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
Co-authored-by: healthy-pod <ahmad@cockroachlabs.com>
Co-authored-by: Andy Yang <yang@cockroachlabs.com>
Co-authored-by: babusrithar <babusrithar@cockroachlabs.com>
craig bot pushed a commit that referenced this issue Feb 9, 2023
96031: sql: add mixed version test for system.role_members user ids upgrade r=rafiss a=andyyang890

This patch adds a mixed version logictest that ensures that GRANT ROLE
continues to work properly in a cluster with both 22.2 and 23.1 nodes
(i.e. nodes that have run the system.role_members user ids upgrade).

Part of #92342

Release note: None

96127: kvserver: introduce cpu rebalancing r=nvanbenschoten a=kvoli

This patch allows the store rebalancer to use CPU in place of QPS when
balancing load on a cluster. This patch adds `cpu` as an option with the
cluster setting:

`kv.allocator.load_based_rebalancing.objective`

When set to `cpu`, rather than `qps`. The store rebalancer will perform
a mostly identical function, however target balancing the sum of all
replica's cpu time on each store, rather than qps. The default remains
as `qps` here.

Similar to QPS, the rebalance threshold can be set to allow controlling
the range above and below the mean store CPU is considered imbalanced,
either overfull or underfull respectively:

`kv.allocator.cpu_rebalance_threshold`: 0.1

In order to manage with mixed versions during upgrade and some
architectures not supporting the cpu sampling method, a rebalance
objective manager is introduced in `rebalance_objective.go`. The manager
mediates access to the rebalance objective and overwrites it in cases
where the objective set in the cluster setting cannot be supported.

The results when using CPU in comparison to QPS can be found [here](https://docs.google.com/document/d/1QLhD20BTamjj3-dSG9F1gW7XMBy9miGPpJpmu2Dn3yo/edit#) (internal).


<details>
<summary>Results Summary</summary>

![image](https://user-images.githubusercontent.com/39606633/215580650-b12ff509-5cf5-4ffa-880d-8387e2ef0afa.png)

![image](https://user-images.githubusercontent.com/39606633/215580626-3d748ba1-e9a4-4abb-8acd-2c319203932e.png)

![image](https://user-images.githubusercontent.com/39606633/215580585-58e6000d-b6cf-430a-b4b7-d14a77eab3bd.png)

</details>


<details>
<summary>Detailed Allocbench Results</summary>

```
kv/r=0/access=skew
master
    median cost(gb):05.81 cpu(%):14.97 write(%):37.83
    stddev cost(gb):01.87 cpu(%):03.98 write(%):07.01
cpu rebalancing
    median cost(gb):08.76 cpu(%):14.42 write(%):36.61
    stddev cost(gb):02.66 cpu(%):01.85 write(%):04.80
kv/r=0/ops=skew
master
    median cost(gb):06.23 cpu(%):26.05 write(%):57.33
    stddev cost(gb):02.92 cpu(%):05.83 write(%):08.20
cpu rebalancing
    median cost(gb):04.28 cpu(%):11.45 write(%):31.28
    stddev cost(gb):02.25 cpu(%):02.51 write(%):06.68
kv/r=50/ops=skew
master
    median cost(gb):04.36 cpu(%):22.84 write(%):48.09
    stddev cost(gb):01.12 cpu(%):02.71 write(%):05.51
cpu rebalancing
    median cost(gb):04.64 cpu(%):13.49 write(%):43.05
    stddev cost(gb):01.07 cpu(%):01.26 write(%):08.58
kv/r=95/access=skew
master
    median cost(gb):00.00 cpu(%):09.51 write(%):01.24
    stddev cost(gb):00.00 cpu(%):01.74 write(%):00.27
cpu rebalancing
    median cost(gb):00.00 cpu(%):05.66 write(%):01.31
    stddev cost(gb):00.00 cpu(%):01.56 write(%):00.26
kv/r=95/ops=skew
master
    median cost(gb):0.00 cpu(%):47.29 write(%):00.93
    stddev cost(gb):0.09 cpu(%):04.30 write(%):00.17
cpu rebalancing
    median cost(gb):0.00 cpu(%):08.16 write(%):01.30
    stddev cost(gb):0.01 cpu(%):04.59 write(%):00.20
```


</details>

resolves: #95380

Release note (ops change) Add option to balance cpu time (cpu)
instead of queries per second (qps) among stores in a cluster. This is
done by setting `kv.allocator.load_based_rebalancing.objective='cpu'`.
`kv.allocator.cpu_rebalance_threshold` is also added, similar to
`kv.allocator.qps_rebalance_threshold` to control the target range for
store cpu above and below the cluster mean.

96440: ui: add execution insights to statement and transaction fingerprint details r=ericharmeling a=ericharmeling

This commit adds execution insights to the Statement Fingerprint and Transaction Fingerprint Details pages.

Part of #83780.

Loom: https://www.loom.com/share/98d2023b672e43fa8016829aa641a829

Note that the SQL queries against the `*_execution_insights` tables are updated to `SELECT DISTINCT ON (*_fingerprint_id, problems, causes)` (equivalent to `GROUP BY (*_fingerprint_id, problems, causes)`) from the latest results in the tables, rather than `row_number() OVER ( PARTITION BY stmt_fingerprint_id, problem, causes ORDER BY end_time DESC ) AS rank... WHERE rank = 1`. Both patterns return the same result, but one uses aggregation and the other uses a window function. I find the `DISTINCT ON/GROUP BY` pattern easier to understand, I'm not seeing much difference in the planning/execution time between the two over the same set of data, and I'm seeing `DISTINCT ON/GROUP BY` coming up as more performant in almost all the secondary sources I've encountered.

Release note (ui change): Added execution insights to the Statement Fingerprint Details and Transaction Fingerprint Details Pages.

96828: collatedstring: support default, C, and POSIX in expressions r=otan a=rafiss

fixes #50734
fixes #95667
informs #57255

---
### collatedstring: create new package 

Move the small amount of code from tree/collatedstring.go

---

### collatedstring: support C and POSIX in expressions

Release note (sql change): Expressions of the form `COLLATE "default"`,
`COLLATE "C"`, and `COLLATE "POSIX"` are now supported. Since the
default collation cannot be changed currently, these expressions are all
equivalent. The expressions are evaluated by treating the input as a
normal string, and ignoring the collation.

This means that comparisons between strings and collated strings that
use "default", "C", or "POSIX" are now supported.

Creating a column with the "C" or "POSIX" collations is still not
supported.

96870: kvserver: use replicasByKey addition func in snapshot path r=tbg a=pavelkalinnikov

This commit makes one step towards better code sharing between `Replica` initialization paths: split trigger and snapshot application. It makes both to use the same method to check and insert the initialized `Replica` to `replicasByKey` map.

Touches #94912

96874: roachtest: run scheduled backup only on clusters with enterprise license r=stevendanna a=msbutler

Epic: none

Release note: None

96883: go.mod: bump Pebble to 829675f94811 r=RaduBerinde a=RaduBerinde

829675f9 db: fix ObsoleteSize stat
2f086b74 db: refactor compaction splitting to reduce key comparisons

Release note: None
Epic: none

Co-authored-by: Andy Yang <yang@cockroachlabs.com>
Co-authored-by: Austen McClernon <austen@cockroachlabs.com>
Co-authored-by: Eric Harmeling <eric.harmeling@cockroachlabs.com>
Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Co-authored-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Co-authored-by: Michael Butler <butler@cockroachlabs.com>
Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
craig bot pushed a commit that referenced this issue Mar 11, 2023
98401: sql: deflake mixed version user ID migration logic tests r=adityamaru a=andyyang890

This patch adds a sleep directive into the mixed version user ID
migration logic tests to reduce the flakiness caused by the cluster
upgrades not finishing quickly enough.

Informs #92342

Release note: None

Co-authored-by: Andy Yang <yang@cockroachlabs.com>
craig bot pushed a commit that referenced this issue Mar 30, 2023
99533: sql,backupccl: add more tests for user ID migrations r=rafiss a=andyyang890

**sql: add mixed version test for system.privileges user ID migration**

This patch adds a mixed version logic test that ensures granting of
system privileges continues to work properly in a cluster that has
both 22.2 and 23.1 nodes. The relevant version gate being tested here
is V23_1SystemPrivilegesTableHasUserIDColumn.

Release note: None

---

**sql: add mixed version test for system.database_role_settings user IDs**

This patch adds a mixed version logic test that ensures setting default
session variables continues to work properly in a cluster that has
both 22.2 and 23.1 nodes. The relevant version gate being tested here
is V23_1DatabaseRoleSettingsHasRoleIDColumn.

Release note: None

---

**backupccl: update TestRestoreOldVersions subtest for system.privileges**

This patch updates the TestRestoreOldVersions subtest for the
system.privileges table to also test that a row for the public
role is correctly restored.

Release note: None

---

Part of #87079 
Part of #92342

99761: sql/pgwire: buffer messages during COPY TO and startup r=rafiss a=rafiss

fixes #97314
backport fixes #99546

---

### sql: buffer COPY OUT data 

Rather than sending each COPY result row one-by-one, now the data will
get buffered, then flushed when the buffer size limit is reached or when
sending ReadyForQuery.

This fixes an issue that was causing the ruby-pg test to hang, since it assumes
the data will be buffered.

---

### pgwire: buffer startup messages when creating connection

This avoids sending each ParameterStatus one-by-one.

---

### sql: refactor CopyIn handling

This makes it so we don't need to manually send a CommandComplete.
Instead, when the CopyInResult is closed, CommandComplete will be sent,
similar to how it works for other message types.

Release note: None

99953: sql/schemachanger: address bugs with column families   r=fqazi a=fqazi

This PR addresses the following bugs with column families:

1. On master and 23.1 after the removal of oprules, we have scenarios where not implemented assertions can be hit for column families when rollbacks occur. These changes add a more concrete assertion that just ensures that the column family is cleaned up, and rules to ensure appropriate sequencing for removal.
2. When UPDATE/INSERTs were executed concurrently while adding a new column family, we could end up writing to the old primary key with the new column family. In happy path cases where everything was successful, this didn't matter, but if a rollback occurred we would have values left in the old primary index that runtime couldn't handle.
3. We had no way of validating DML with concurrent schema changes in cases with rollbacks, these modifications add tests and the framework required this case.

Release note (bug fix): Concurrent DML while adding
a new column with a new column family can lead to
corruption in the existing primary index. If a rollback
occurs the table may no longer be accessible.

Co-authored-by: Andy Yang <yang@cockroachlabs.com>
Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
None yet
Development

No branches or pull requests

1 participant