Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: future proof tests for span configs #74265

Merged
merged 13 commits into from
Jan 13, 2022

Conversation

irfansharif
Copy link
Contributor

@irfansharif irfansharif commented Dec 24, 2021

This PR is a (mostly) test-only change that paves the way for enabling span
configs by default (#73876). Various tests throughout CRDB rely on timing
assumptions with respect to how quickly splits are induced or configs applied,
assumptions that are no longer true with the span configs infrastructure
because:

  • updated descriptors/zone configs make its way to the global span config
    state system.span_configurations asynchronously, and
  • kv learns about learns about these updates asynchronously.

Consider TestTruncatePreservesSplitPoints for example, which relied on range
split decisions to happen near-instantaneously (especially for the single node
variant) -- a fair assumption with the gossiped SystemConfigSpan, but no longer
applicable.

Another example is TestChangefeedProtectedTimestamps, where we use an
extremely low gc.ttlseconds. Given span configs makes use of rangefeeds
internally, feeds that error out if started at timestamps already GC-ed, we
need to ensure a sufficiently lax GC TTL.

For most of these tests, making them agnostic to either subsystem is just a
matter of throwing in a SucceedsSoon block at the right places. See individual
commits/commentary for the individual test changes.

Release note: None


First commit is from #74171 and should be reviewed there.

@irfansharif irfansharif requested a review from a team December 24, 2021 00:25
@irfansharif irfansharif requested review from a team as code owners December 24, 2021 00:25
@irfansharif irfansharif requested review from a team, stevendanna, nvanbenschoten, ajwerner and arulajmani and removed request for a team and stevendanna December 24, 2021 00:25
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@irfansharif irfansharif changed the title *: future proof tests for the span configs infrastructure *: future proof tests for span config Dec 24, 2021
@irfansharif irfansharif changed the title *: future proof tests for span config *: future proof tests for span configs Dec 24, 2021
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 14 of 14 files at r1, 14 of 14 files at r2, 1 of 1 files at r3, 4 of 4 files at r4, 1 of 1 files at r5, 2 of 2 files at r6, 1 of 1 files at r7, 1 of 1 files at r8, 1 of 1 files at r9, 1 of 1 files at r10, 1 of 1 files at r11, 1 of 1 files at r12, 1 of 1 files at r13, 1 of 1 files at r14, 2 of 2 files at r15, 2 of 2 files at r16, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @arulajmani, and @irfansharif)


-- commits, line 51 at r5:
s/the/then/


pkg/ccl/changefeedccl/changefeed_test.go, line 3582 at r12 (raw file):

			sqlDB := sqlutils.MakeSQLRunner(db)
			sqlDB.Exec(t, `ALTER RANGE default CONFIGURE ZONE USING gc.ttlseconds = 100`)
			sqlDB.Exec(t, `ALTER RANGE system CONFIGURE ZONE USING gc.ttlseconds = 100`)

Why was the test dropping the gc.ttl so low? Does bumping it up make the test pass trivially even if a protected timestamp is not written?


pkg/ccl/multiregionccl/roundtrips_test.go, line 116 at r10 (raw file):

		conn := tc.ServerConn(i)
		isLeaseHolder := false
		testutils.SucceedsSoon(t, func() error {

This seems like the wrong place to wait for the split. Should we do it right after creating the table?

We're going to start to use this infrastructure for realsies soon.

Release note: None
We were previously using an empty one with no control at the caller to
override specific values. This comes in handy when looking to control
knobs for all tenants in these tests.

Release note: None
In a future commit we'll enable the span configs infrastructure by
default for all crdb unit tests. Doing so will surface the need to have
the reconciliation job disabled for the specific migration tests that
assert on the contents of `system.span_configurations` (also written to
by the reconciliation job/reconciler).

Release note: None
This is a follow up to cockroachdb#73531, there we forgot to update package tests
to also use consistent reads when looking up descriptors. Looking at the
surrounding commentary in these datadriven files and comparing against
the actual results, there was a mismatch (now no longer so).

Release note: None
This test previously relied on range split decisions to happen
near-instantaneously (especially for the single node variant). This was
a fair assumption with the gossiped SystemConfigSpan, but is no longer
true with the span configs infrastructure where
(i)  updated descriptors/zone configs make its way to
     `system.span_configurations` asynchronously, and
(ii) KV learns about learns about `system.span_configurations` updates
     asynchronously.

We update the test to be agnostic to either subsystem (tl;dr - throw in
a SucceedsSoon block at the right places).

Release note: None
This is Yet Another Test that made timing assumptions on how
instantaneously range split decisions appear, assumptions that no longer
hold under the span configs infrastructure. Adding compatibility is a
matter of waiting for splits to appear instead of just expecting it.

Release note: None
This is Yet Another Test that made timing assumptions on how
instantaneously range split decisions appear, assumptions that don't
hold under the span configs infrastructure. Adding compatibility is a
matter of waiting for splits to appear instead of only expecting it.

Release note: None
This is Yet Another Test that made timing assumptions on how
instantaneously range split decisions appear, assumptions that don't
hold under the span configs infrastructure. Adding compatibility is a
matter of waiting for splits to appear instead of only expecting it.

Release note: None
This is Yet Another Test that made timing assumptions on how
instantaneously range split decisions appear, assumptions that don't
hold under the span configs infrastructure. Adding compatibility is a
matter of waiting for splits to appear instead of only expecting it.

Release note: None
The low gc.ttlseconds in this test that applies to
system.{descriptor,zones}, when run with span configs enabled (in a
future commit), runs into errors introduced in cockroachdb#73086. The span configs
infrastructure makes use of rangefeeds against these tables within the
spanconfig.SQLWatcher process. These rangefeeds error out if the
timestamp they're started with is already GC-ed, something that's very
likely with low GC TTLs. To accommodate, we simply bump the TTL to a
more reasonable 100s.

Release note: None
This is Yet Another Test that made timing assumptions on how
instantaneously range config decisions are applied, assumptions that
don't hold under the span configs infrastructure. Adding compatibility
is a matter of waiting for the right configs to appear instead of only
expecting it.

Release note: None
This testing knob was added in cockroachdb#68374 but I'm not sure that it was
necessary? Brief stress runs with an without this flag did not surface
anything. In a future commit where we enable span configs by default,
we'll actually rely on the reconciliation job running, so we get rid of
this flag now.

Release note: None
Instead of using hooks that directly mutate the system config span,
using SQL statements to tweak zone configs future proofs this test for
compatibility with the span configs infrastructure.

Release note: None
Copy link
Contributor Author

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR! Waiting for green.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @arulajmani, and @nvanbenschoten)


pkg/ccl/changefeedccl/changefeed_test.go, line 3582 at r12 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why was the test dropping the gc.ttl so low? Does bumping it up make the test pass trivially even if a protected timestamp is not written?

I'm not really sure, @ajwerner perhaps? (introduced in 84ce391.)

The test does assert on the protection record existing/not existing, and I don't we're executing anything that would cause GC to occur. My guess is to speed up the test? But don't know about the internals to say for sure.


pkg/ccl/multiregionccl/roundtrips_test.go, line 116 at r10 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This seems like the wrong place to wait for the split. Should we do it right after creating the table?

It's not just splits here, it's also the configs that the test is asserting on. (I had tried waiting for just splits earlier, but the config propagation itself was unaccounted for.)

We don't have good testing library APIs that let you wait for a config to materialize in this way, and short of introducing introducing them, this felt more targeted. I think we'll soon want more of these APIs with the additional amount of asynchrony this infra bakes in.

@irfansharif
Copy link
Contributor Author

bors r+

@irfansharif
Copy link
Contributor Author

CI failure is an existing flake: #74819.

@craig
Copy link
Contributor

craig bot commented Jan 13, 2022

Build succeeded:

@craig craig bot merged commit ca44523 into cockroachdb:master Jan 13, 2022
@irfansharif irfansharif deleted the 211223.future-proof-tests branch January 13, 2022 22:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants