closedts: shorten target_duration from 30s to 5s #39643

ajwerner · 2019-08-13T22:22:15Z

This PR dramatically shortens the closed timestamp target interval. With this
setting the experimental_follower_read_timestamp() will now be 8.5 seconds in
the past as opposed to the current 48 seconds in the past. This relatively
aggressive setting of 5s is intended to shake out issues as we head into the stability
period.

This value has been verified to work on the schemachange/mixed/tpcc roachtest
introduced in #39096 and for vanilla TPC-C runs around where we have
established a baseline in the past using tpccbench. Specifically several runs
at 2300 warehouses on 3x c5d.4xlarge nodes which is right at the passing
boundary without the change were run only a very small difference in
efficiency or tail latency was observed.

It seems reasonable to attempt to live with this value on master for a while
and see what happens.

Fixes #37083.

Release note: None

cockroach-teamcity · 2019-08-13T22:22:22Z

This change is

ajwerner · 2019-08-13T22:22:25Z

cc @awoods187

awoods187 · 2019-08-13T22:26:13Z

Awesome!

tbg · 2019-08-14T06:51:07Z

Are we concerned about transactions taking more than 5s retrying endlessly? This was a problem previously, though only with 48s+ transaction. Users' batch jobs may go haywire. Arguably if the closed timestamp prevents a txn once, it will do so again (in the absence of contention at least).

tbg · 2019-08-14T09:11:26Z

Also, please run a few manual roachprod-stressrace invocations. This sort of change is probably going to give us some fallout, and since master already is a bit flaky we should weed that out before.

ajwerner · 2019-08-14T13:18:45Z

Are we concerned about transactions taking more than 5s retrying endlessly? This was a problem previously, though only with 48s+ transaction. Users' batch jobs may go haywire. Arguably if the closed timestamp prevents a txn once, it will do so again (in the absence of contention at least).

Yes, IIUC that's true for transactions which observe the clock. For transactions which don't, shouldn't they get pushed and then refresh, right?

Also, please run a few manual roachprod-stressrace invocations. This sort of change is probably going to give us some fallout, and since master already is a bit flaky we should weed that out before.

TestNodeLivenessSetDecommissioning fails on master and also this branch almost instantly:

I190814 13:16:32.434790 1641 storage/store_snapshot.go:786  [replicate,s1,r6/1:/Table/{SystemCon…-11}] sending LEARNER snapshot 0243daa0 at applied index 16                                                                                                                   
I190814 13:16:32.435061 1641 storage/store_snapshot.go:829  [replicate,s1,r6/1:/Table/{SystemCon…-11}] streamed snapshot to (n2,s2):2: kv pairs: 44, log entries: 0, rate-limit: 8.0 MiB/sec, 0.00s                                                                            
I190814 13:16:32.435720 2327 storage/replica_raftstorage.go:823  [s2,r6/2:{-}] applying LEARNER snapshot at index 16 (id=0243daa0, encoded size=8206, 1 rocksdb batches, 0 log entries)                                                                                        
I190814 13:16:32.436232 2327 storage/replica_raftstorage.go:829  [s2,r6/2:/Table/{SystemCon…-11}] applied LEARNER snapshot in 0ms [clear=0ms batch=0ms entries=0ms commit=0ms]                                                                                                 
I190814 13:16:32.437799 1641 storage/replica_command.go:1198  [replicate,s1,r6/1:/Table/{SystemCon…-11}] change replicas (ADD_REPLICA (n2,s2):2): existing descriptor r6:/Table/{SystemConfigSpan/Start-11} [(n1,s1):1, (n2,s2):2LEARNER, next=3, gen=1]

I'll wait until the stress passes on critical packages before attempting to merge this.

ajwerner · 2019-08-14T13:43:18Z

The test failure seems to not be directly related to learners as something similar happens when learners are disabled. I'm bisecting before typing up an issue.

I190814 13:38:29.275572 4251 storage/node_liveness.go:474  [liveness-hb] heartbeat failed on epoch increment; retrying
I190814 13:38:29.275719 2646 storage/store_snapshot.go:786  [replicate,s1,r6/1:/Table/{SystemCon…-11}] sending PREEMPTIVE snapshot 4e31ebe4 at applied index 13
I190814 13:38:29.275901 2646 storage/store_snapshot.go:829  [replicate,s1,r6/1:/Table/{SystemCon…-11}] streamed snapshot to (n2,s2):?: kv pairs: 42, log entries: 0, rate-limit: 8.0 MiB/sec, 0.01s
I190814 13:38:29.276288 4014 storage/replica_raftstorage.go:823  [s2,r6/?:{-}] applying PREEMPTIVE snapshot at index 13 (id=4e31ebe4, encoded size=8023, 1 rocksdb batches, 0 log entries)
I190814 13:38:29.277049 4014 storage/replica_raftstorage.go:829  [s2,r6/?:/Table/{SystemCon…-11}] applied PREEMPTIVE snapshot in 1ms [clear=0ms batch=0ms entries=0ms commit=1ms]
I190814 13:38:29.277333 2646 storage/replica_command.go:1198  [replicate,s1,r6/1:/Table/{SystemCon…-11}] change replicas (ADD_REPLICA (n2,s2):2): existing descriptor r6:/Table/{SystemConfigSpan/Start-11} [(n1,s1):1, next=2, gen=0]
I190814 13:38:29.278707 2646 storage/replica_raft.go:291  [replicate,s1,r6/1:/Table/{SystemCon…-11},txn=4a2f147a] proposing ADD_REPLICA[(n2,s2):2]: after=[(n1,s1):1 (n2,s2):2] next=3
I190814 13:38:29.282106 2646 storage/store_snapshot.go:786  [replicate,s1,r6/1:/Table/{SystemCon…-11}] sending PREEMPTIVE snapshot 675e4e73 at applied index 16
I190814 13:38:29.296762 3979 storage/replica_raftstorage.go:823  [s3,r6/?:{-}] applying PREEMPTIVE snapshot at index 16 (id=675e4e73, encoded size=8204, 1 rocksdb batches, 0 log entries)
I190814 13:38:29.297540 3979 storage/replica_raftstorage.go:829  [s3,r6/?:/Table/{SystemCon…-11}] applied PREEMPTIVE snapshot in 1ms [clear=0ms batch=0ms entries=0ms commit=1ms]
I190814 13:38:29.297944 2646 storage/store_snapshot.go:829  [replicate,s1,r6/1:/Table/{SystemCon…-11}] streamed snapshot to (n3,s3):?: kv pairs: 44, log entries: 0, rate-limit: 8.0 MiB/sec, 0.02s
I190814 13:38:29.298187 2646 storage/replica_command.go:1198  [replicate,s1,r6/1:/Table/{SystemCon…-11}] change replicas (ADD_REPLICA (n3,s3):3): existing descriptor r6:/Table/{SystemConfigSpan/Start-11} [(n1,s1):1, (n2,s2):2, next=3, gen=1]
F190814 13:38:29.303312 4123 storage/replica_raft.go:668  [s2,r6/2:{-}] setting non-zero HardState.Commit on uninitialized replica [n2,s2,r6/2:{-}]. HS={Term:6 Vote:0 Commit:16 XXX_unrecognized:[]}
goroutine 4123 [running]:

tbg · 2019-08-14T14:11:11Z

Yes, IIUC that's true for transactions which observe the clock. For transactions which don't, shouldn't they get pushed and then refresh, right?

Yes, though if the txn takes more than 5s, the refresh may fail. Definitely txns which don't read are safe.

The test failure seems to not be directly related to learners as something similar happens when learners are disabled. I'm bisecting before typing up an issue.

Uhoh, that doesn't look good. Curious what you'll find.

tbg · 2019-08-14T14:12:04Z

(My hope is that this is #39604)

ajwerner · 2019-08-14T14:16:43Z

(My hope is that this is #39604)

I don't think so. This reproduces as far back as a8d548f which is months old. The bisecting continues.

ajwerner · 2019-08-23T02:43:21Z

I've been stressing things and saw:

    --- FAIL: TestRangeTransferLeaseExpirationBased/DrainTransfer (0.17s)
        client_replica_test.go:568: change replicas of r1 failed: descriptor changed: [expected] r1:/M{in-ax} [(n1,s1):1, (n2,s2):2LEARNER, next=3, gen=1] != [actual] r1:/M{in-ax} [(n1,s1):1, next=3, gen=2]

Which I don't think is related.

Stressing it a while longer and then I'm going to bors this. Please don't hesitate to revert it.

While stressing storage for cockroachdb#39643 I encountered TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic failing under stress. It also complained about the parent testing.T having been failed when then using a child so fixed that too though in a perhaps messy way. ``` --- FAIL: TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic (2.29s) --- PASS: TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic/initial_run (0.14s) client_test.go:1280: [NotLeaseHolderError] r1: replica (n1,s1):1 not lease holder; current lease is repl=(n3,s3):3 seq=4 start=1566532897.322355813,1 exp=1566532898.378802707,0 pro=1566532897.478824123,0 --- FAIL: TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic/after_restart (1.02s) testing.go:820: test executed panic(nil) or runtime.Goexit: subtest may have called FailNow on a parent test ``` Release note: None

ajwerner · 2019-08-23T05:20:34Z

--- FAIL: TestStoreRangeMergeDuringShutdown (0.15s)
    client_merge_test.go:3209: expected "failed to send RPC: sending to all 1 replicas failed; last error: (err: node unavailable; try another peer) <nil>" error, but got not lease holder

I also stressed master just to see what happens and observed some other flakes:

    --- FAIL: TestRangeTransferLeaseExpirationBased/DrainTransferWithExtension (0.17s)
        client_replica_test.go:568: change replicas of r1 failed: descriptor changed: [expected] r1:/M{in-ax} [(n1,s1):1, (n2,s2):2LEARNER, next=3, gen=1] != [actual] r1:/M{in-ax} [(n1,s1):1, next=3, gen=2]

and

--- FAIL: TestInitRaftGroupOnRequest (0.58s)
    client_raft_test.go:3903: expected NotLeaseHolderError; got <nil>

and

--- FAIL: TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic (2.29s)
    --- PASS: TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic/initial_run (0.14s)
    client_test.go:1280: [NotLeaseHolderError] r1: replica (n1,s1):1 not lease holder; current lease is repl=(n3,s3):3 seq=4 start=1566532897.322355813,1 exp=1566532898.378802707,0 pro=1566532897.478824123,0
    --- FAIL: TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic/after_restart (1.02s)
        testing.go:820: test executed panic(nil) or runtime.Goexit: subtest may have called FailNow on a parent test

(#39838)

Stressing storage and fixing the various rare failures would be good. In the meantime it's not clear that this made anything worse. If it becomes apparent that's not true, please do revert it.

bors r+

craig · 2019-08-23T05:28:37Z

Build failed

GitHub CI (Cockroach)

ajwerner · 2019-08-23T05:35:18Z

Fixed a constant in the follower read test.

bors r+

craig · 2019-08-23T05:44:21Z

Build failed

GitHub CI (Cockroach)

This PR dramatically shortens the closed timestamp target interval. With this setting the experimental_follower_read_timestamp() will now be 3.7 seconds in the past as opposed to the current 48 seconds in the past. This relatively aggressive setting of 5s is intended to shake out issues. This value has been verified to work on the `schemachange/mixed/tpcc` roachtest introduced in cockroachdb#39096 and for vanilla TPC-C runs around where we have established a baseline in the past using tpccbench. Specifically several runs at 2300 warehouses on 3x c5d.4xlarge nodes which is right at the passing boundary without the change were run only a very small difference in efficiency or tail latency was observed. It seems reasonable to attempt to live with this value on master for a while and see what happens. Fixes cockroachdb#37083. Release note: None

ajwerner · 2019-08-23T05:49:25Z

Generated the settings html file.

bors r+

craig · 2019-08-23T06:05:08Z

Build failed

GitHub CI (Cockroach)

ajwerner · 2019-08-23T06:16:34Z

Flaked on #39610

bors r+

39643: closedts: shorten target_duration from 30s to 5s r=ajwerner a=ajwerner This PR dramatically shortens the closed timestamp target interval. With this setting the experimental_follower_read_timestamp() will now be 8.5 seconds in the past as opposed to the current 48 seconds in the past. This relatively aggressive setting of 5s is intended to shake out issues as we head into the stability period. This value has been verified to work on the `schemachange/mixed/tpcc` roachtest introduced in #39096 and for vanilla TPC-C runs around where we have established a baseline in the past using tpccbench. Specifically several runs at 2300 warehouses on 3x c5d.4xlarge nodes which is right at the passing boundary without the change were run only a very small difference in efficiency or tail latency was observed. It seems reasonable to attempt to live with this value on master for a while and see what happens. Fixes #37083. Release note: None Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>

craig · 2019-08-23T06:41:49Z

Build succeeded

GitHub CI (Cockroach)

While stressing storage for cockroachdb#39643 I encountered TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic failing under stress. It also complained about the parent testing.T having been failed when then using a child so fixed that too though in a perhaps messy way. ``` --- FAIL: TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic (2.29s) --- PASS: TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic/initial_run (0.14s) client_test.go:1280: [NotLeaseHolderError] r1: replica (n1,s1):1 not lease holder; current lease is repl=(n3,s3):3 seq=4 start=1566532897.322355813,1 exp=1566532898.378802707,0 pro=1566532897.478824123,0 --- FAIL: TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic/after_restart (1.02s) testing.go:820: test executed panic(nil) or runtime.Goexit: subtest may have called FailNow on a parent test ``` Release note: None

39832: changefeedccl: allow base64-encoded client certificate r=rolandcrosby a=rolandcrosby Adds `client_cert` and `client_key` options to the `kafka://` changefeed URI scheme. Works like the existing `ca_cert` option: the user base64-encodes the contents of a PEM certificate and private key, and passes those base64 values as parameters in the Kafka URI. Fixes #39817. Release note (enterprise change): Client certificates are now supported for Kafka changefeed authentication. 39838: storage: fix stress flake r=ajwerner a=ajwerner While stressing storage for #39643 I encountered TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic failing under stress. It also complained about the parent testing.T having been failed when then using a child so fixed that too though in a perhaps messy way. ``` --- FAIL: TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic (2.29s) --- PASS: TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic/initial_run (0.14s) client_test.go:1280: [NotLeaseHolderError] r1: replica (n1,s1):1 not lease holder; current lease is repl=(n3,s3):3 seq=4 start=1566532897.322355813,1 exp=1566532898.378802707,0 pro=1566532897.478824123,0 --- FAIL: TestDefaultConnectionDisruptionDoesNotInterfereWithSystemTraffic/after_restart (1.02s) testing.go:820: test executed panic(nil) or runtime.Goexit: subtest may have called FailNow on a parent test ``` Release note: None Co-authored-by: Roland Crosby <roland@cockroachlabs.com> Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>

This reverts commit 3e97e57. The last couple of weeks of experience show that tests become flakey with the target_duration set to 5s. This isn't overly surprising given that updates to table descriptors observe their own timestamp and thus can never be refreshed. See https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/sqlbase/structured.go#L1496-L1509 References cockroachdb#39643. Release note: None

40527: Revert "closedts: shorten target_duration from 30s to 5s" r=andy-kimball a=ajwerner This reverts commit 3e97e57. The last couple of weeks of experience show that tests become flakey with the target_duration set to 5s. This isn't overly surprising given that updates to table descriptors observe their own timestamp and thus can never be refreshed. See https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/sqlbase/structured.go#L1496-L1509 References #39643. Release note: None Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>

ajwerner requested a review from nvanbenschoten August 13, 2019 22:22

nvanbenschoten approved these changes Aug 13, 2019

View reviewed changes

ajwerner mentioned this pull request Aug 14, 2019

storage: TestNodeLivenessSetDecommissioning fatals under stress #39660

Closed

ajwerner mentioned this pull request Aug 23, 2019

storage: fix stress flake #39838

Merged

ajwerner force-pushed the ajwerner/shorten-closed-ts-target-interval branch from 9c29e06 to a3d077c Compare August 23, 2019 05:34

ajwerner force-pushed the ajwerner/shorten-closed-ts-target-interval branch from a3d077c to 3e97e57 Compare August 23, 2019 05:48

craig bot merged commit 3e97e57 into cockroachdb:master Aug 23, 2019

nvanbenschoten mentioned this pull request Sep 5, 2019

teamcity: failed test: TestTxnCoordSenderRetries #40466

Closed

ajwerner mentioned this pull request Sep 5, 2019

teamcity: failed test: TestLogic #40504

Closed

ajwerner mentioned this pull request Sep 5, 2019

Revert "closedts: shorten target_duration from 30s to 5s" #40527

Merged

ajwerner mentioned this pull request Sep 8, 2019

sql,sqlbase,client: bootstrap TableDescriptor timestamps from MVCC #40581

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

closedts: shorten target_duration from 30s to 5s #39643

closedts: shorten target_duration from 30s to 5s #39643

ajwerner commented Aug 13, 2019

cockroach-teamcity commented Aug 13, 2019

ajwerner commented Aug 13, 2019

awoods187 commented Aug 13, 2019

tbg commented Aug 14, 2019

tbg commented Aug 14, 2019

ajwerner commented Aug 14, 2019

ajwerner commented Aug 14, 2019

tbg commented Aug 14, 2019

tbg commented Aug 14, 2019

ajwerner commented Aug 14, 2019

ajwerner commented Aug 23, 2019

ajwerner commented Aug 23, 2019

craig bot commented Aug 23, 2019

ajwerner commented Aug 23, 2019

craig bot commented Aug 23, 2019

ajwerner commented Aug 23, 2019

craig bot commented Aug 23, 2019

ajwerner commented Aug 23, 2019

craig bot commented Aug 23, 2019

closedts: shorten target_duration from 30s to 5s #39643

closedts: shorten target_duration from 30s to 5s #39643

Conversation

ajwerner commented Aug 13, 2019

cockroach-teamcity commented Aug 13, 2019

ajwerner commented Aug 13, 2019

awoods187 commented Aug 13, 2019

tbg commented Aug 14, 2019

tbg commented Aug 14, 2019

ajwerner commented Aug 14, 2019

ajwerner commented Aug 14, 2019

tbg commented Aug 14, 2019

tbg commented Aug 14, 2019

ajwerner commented Aug 14, 2019

ajwerner commented Aug 23, 2019

ajwerner commented Aug 23, 2019

craig bot commented Aug 23, 2019

Build failed

ajwerner commented Aug 23, 2019

craig bot commented Aug 23, 2019

Build failed

ajwerner commented Aug 23, 2019

craig bot commented Aug 23, 2019

Build failed

ajwerner commented Aug 23, 2019

craig bot commented Aug 23, 2019

Build succeeded