jepsen-monotonic failure when using RootTxn on the gateway #41424

andreimatei · 2019-10-08T01:11:57Z

#41307 changed how DistSQL uses transactions. It's first commit 29e18f0 makes a seemingly innocuous change about using RootTxns on the gateway more often than before. This change causes the monotonic jepsen test to fail. It fails about 50% with no nemesis, and always with some of the nemeses (e.g. majority-ring).

The test essentially uses two tables and runs transactions that read the maximum from both tables, increment by one, and write to one of the tables that maximum and the cluster_logical_timestamp(). It then verifies that both the maximum values and the timestamps increase monotonically. But lo and behold, they're no longer monotonic.

I have a theory about the cause: I believe it's because of this check intended to prevent refreshes when cluster_logical_timestamp() (or other things that lock a txn's timestamp) has been used

cockroach/pkg/roachpb/data.go

Line 1284 in e3346c1

if txn == nil || txn.OrigTimestampWasObserved {

I believe the check is looking at the wrong txn proto. It's looking at a proto coming from a pErr, but it's unlikely for that one to have the OrigTimestampWasObserved bit set. That bit is probably only set on the client, not the server.
If this theory is correct, this means that we've always had this bug, but #41307 exposed it more broadly. It used to be the case that only queries forced to be planned entirely on the gateway (e.g. mutations) would use the RootTxn, others would use the LeafTxn. Leaves don't do refreshes, and so they're safe (and suboptimal when a refresh would help).
Besides regular queries, schema changes used to use the OrigTimestampWasObserved bit too, so I think those probably are at risk in 19.1 and below.

If this theory is correct, this use of the wrong txn object steams from the fact that we put transactions in pErrs, for maximal confusion. It's always been a pet peeve of mine:
#18919 (comment)

The text was updated successfully, but these errors were encountered:

knz · 2019-10-08T14:52:41Z

I confirm this bug is present in master (post revert of Andrei's recent fixes),
also in 19.1!

also I have written a simplified version of the jepsen test in Go suitable for make stress:

monotonic_test.zip (place in pkg/sql then run make stress PKG=./pkg/sql TESTS=TestMonotonic)

Now also testing 2.1 for good measure.

knz · 2019-10-08T14:56:42Z

Confirming bug present in 2.1 too.

bdarnell · 2019-10-08T15:36:39Z

What do you mean by the bug being present in older releases? Just that we're looking at the wrong Transaction object, or is the monotonic test failing there too? If the latter, do you understand why this test is failing while the jepsen test has never (to my knowledge) hit this before on older versions?

knz · 2019-10-08T15:48:44Z

What do you mean by the bug being present in older releases? Just that we're looking at the wrong Transaction object, or is the monotonic test failing there too?

Both

If the latter, do you understand why this test is failing while the jepsen test has never (to my knowledge) hit this before on older versions?

Yes I understand it.

The Jepsen test separates the phases in a multi-statement txn, using standalone SELECTs first to find the max and cluster timestamp, and then a separate INSERT ... VALUES for the mutation. These standalone SELECTs are using LeafTxn objects since at least 2.1 (ever since we use distsql to serve them, which may have been 2.0), so were never able to auto-refresh (instead any conflict would have caused a client-side retry)

In order to tickle the bug, you need a read that encounters a refresh error on a RootTxn object with a mutation that also gets a call to cluster_logical_timestamp(). To get this, you need a SQL that combines INSERT with cluster_logical_timestamp() (and possibly some SELECT too, I'm not sure), which is what my simplified Go code does:

INSERT INTO t.t%d(v, ts) SELECT max(x)+1, if(pg_sleep(0.005*random()), cluster_logical_timestamp(), 0)
  FROM (VALUES
   ((SELECT max(v) FROM t.t1)),
   ((SELECT max(v) FROM t.t2))
) AS x(x)

This triggers the bug because in 2.1 19.1 and current master, a mutation is planned locally and is not distributed and so everything uses the RootTxn.

I'd say client apps that only use INSERT ... VALUES are unaffected.

However an UPDATE combined with cluster_logical_timestamp() is likely to be affected since 2.1 too.

knz · 2019-10-08T16:00:07Z

Andrei tells me that it's not only queries that use cluster_logical_timestamp() that are affected; anything using the txn.OrigTimestamp() call probably had that problem too. That includes schema changes up to and including 19.1.

andreimatei · 2019-10-08T17:37:58Z

The theory about us looking at pErr.Txn being wrong didn't quite pan out. On the response path, the DistSender updates pErr.Txn with the response's txn, which comes from the request's txn. I find all this very unlikely to be correct and I'll try to clean it up, but it's not our smoking gun. Fixing it does not make Jepsen happy.

However, I think the real problem is that cluster_logical_timestamp() is more clearly broken. It calls TxnCoordSender.CommitTimestmp(), which simply returns txn.OrigTimestamp. But, in case the txn already refreshed, the OrigTimestamp is not going to be the CommitTimestamp. The commit timestamp is:

commitTS := txn.OrigTimestamp
commitTS.Forward(txn.RefreshedTimestamp)

Before this patch, the txn.CommitTimestamp() function (which powers the cluster_logical_timestamp(), among others) was failing to take into consideration possible refreshes that might have happened since the current epoch began (i.e. txn.RefreshedTimestamp). Fixes cockroachdb#41424 Release note (bug fix): Fix a bug causing the cluster_logical_timestamp() function to sometimes return incorrect results.

bdarnell · 2019-10-08T18:24:18Z

in case the txn already refreshed

Ah, that makes sense. Originally refreshes were only possible at the end of the transaction, but early refreshes/pushes are now possible.

Before this patch, the txn.CommitTimestamp() function (which powers the cluster_logical_timestamp(), among others) was failing to take into consideration possible refreshes that might have happened since the current epoch began (i.e. txn.RefreshedTimestamp). Fixes cockroachdb#41424 Release note (bug fix): Fix a bug causing the cluster_logical_timestamp() function to sometimes return incorrect results.

41323: sql: bevy of fixes for ActiveRecord compatibility r=jordanlewis a=jordanlewis - improve efficiency of `'foo'::REGCLASS` - improve efficiency of `pg_get_coldesc` - parse INTERVAL(6) as a no-op meaning INTERVAL - bugfix to generate_subscripts for oidvector and int2vector - add pg_available_extensions - fix `pg_get_indexdef` and `pg_attrdef.adbin` to be more Postgres compatible With these fixes, 90% of the fork in `activerecord-cockroachdb-adapter` can go away. 41438: kv: fix cluster_logical_timestamp() r=andreimatei a=andreimatei Before this patch, the txn.CommitTimestamp() function (which powers the cluster_logical_timestamp(), among others) was failing to take into consideration possible refreshes that might have happened since the current epoch began (i.e. txn.RefreshedTimestamp). Fixes #41424 Release note (bug fix): Fix a bug causing the cluster_logical_timestamp() function to sometimes return incorrect results. Co-authored-by: Jordan Lewis <jordanthelewis@gmail.com> Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>

Before this patch, the txn.CommitTimestamp() function (which powers the cluster_logical_timestamp(), among others) was failing to take into consideration possible refreshes that might have happened since the current epoch began (i.e. txn.RefreshedTimestamp). Fixes cockroachdb#41424 Release note (bug fix): Fix a bug causing the cluster_logical_timestamp() function to sometimes return incorrect results.

41439: release-19.2: kv: fix cluster_logical_timestamp() r=andreimatei a=andreimatei Backport 1/1 commits from #41438. /cc @cockroachdb/release --- Before this patch, the txn.CommitTimestamp() function (which powers the cluster_logical_timestamp(), among others) was failing to take into consideration possible refreshes that might have happened since the current epoch began (i.e. txn.RefreshedTimestamp). Fixes #41424 Release note (bug fix): Fix a bug causing the cluster_logical_timestamp() function to sometimes return incorrect results. Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>

andreimatei added A-kv-client Relating to the KV client and the KV interface. S-1 High impact: many users impacted, serious risk of high unavailability or data loss labels Oct 8, 2019

andreimatei self-assigned this Oct 8, 2019

andy-kimball mentioned this issue Oct 8, 2019

19.2 release blockers list #40447

Closed

53 tasks

knz changed the title ~~jepsen-monotonic failure when using RootTxn on the gateway~~ jepsen-monotonic failure when using RootTxn + cluster_logical_timestamp on the gateway Oct 8, 2019

knz changed the title ~~jepsen-monotonic failure when using RootTxn + cluster_logical_timestamp on the gateway~~ jepsen-monotonic failure when using RootTxn on the gateway Oct 8, 2019

andreimatei mentioned this issue Oct 8, 2019

kv: fix cluster_logical_timestamp() #41438

Merged

andreimatei mentioned this issue Oct 8, 2019

release-19.2: kv: fix cluster_logical_timestamp() #41439

Merged

andreimatei mentioned this issue Oct 8, 2019

release-19.1: kv: fix cluster_logical_timestamp() #41441

Merged

andreimatei mentioned this issue Oct 8, 2019

release-2.1: kv: fix cluster_logical_timestamp() #41442

Merged

craig bot closed this as completed in b491dde Oct 8, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

jepsen-monotonic failure when using RootTxn on the gateway #41424

jepsen-monotonic failure when using RootTxn on the gateway #41424

andreimatei commented Oct 8, 2019

knz commented Oct 8, 2019 •

edited

Loading

knz commented Oct 8, 2019

bdarnell commented Oct 8, 2019

knz commented Oct 8, 2019 •

edited

Loading

knz commented Oct 8, 2019

andreimatei commented Oct 8, 2019

bdarnell commented Oct 8, 2019

jepsen-monotonic failure when using RootTxn on the gateway #41424

jepsen-monotonic failure when using RootTxn on the gateway #41424

Comments

andreimatei commented Oct 8, 2019

knz commented Oct 8, 2019 • edited Loading

knz commented Oct 8, 2019

bdarnell commented Oct 8, 2019

knz commented Oct 8, 2019 • edited Loading

knz commented Oct 8, 2019

andreimatei commented Oct 8, 2019

bdarnell commented Oct 8, 2019

knz commented Oct 8, 2019 •

edited

Loading

knz commented Oct 8, 2019 •

edited

Loading