sql: add SHOW COMMIT TIMESTAMP to retrieve a causality token #80848

ajwerner · 2022-05-02T05:00:15Z

Relates to #7945

Release note (sql change): A new sql statement SHOW COMMIT TIMESTAMP has been
added. This statement can be used to retrieve the commit timestamp of the
current explicit transaction, current multi-statement implicit transaction, or
previous transaction. The statement may be used in a variety of settings to
maximize its utility in the face of connection pooling.

When used as a part of an explicit transaction, the statement implicitly
commits the transaction internally before being able to return a causality
token. This is similar to the RELEASE cockroach_restart behavior; after
issuing this statement, commands to the transaction will be rejected until
COMMIT is issued.

When used as part of a multi-statement implicit transaction, the statement
must be the final statement. If it occurs in the middle of a multi-statement
implicit transaction, it will be rejected with an error.

When sent as a stand-alone single-statement implicit transaction, it will
return the commit timestamp of the previously committed transaction. If there
was no transaction on the connection, or the previous transaction did not
commit (either rolled back or encountered an error), the command will fail with
an error code 25000 (InvalidTransactionState).

The SHOW COMMIT TIMESTAMP statement is idempotent; it can be issued
multiple times both inside of and after transactions and will return the same result.
One rough edge is that the cockroach sql cli command will, by default, send
statements on behalf of the user which will lead to repeated issuances of
SHOW COMMIT TIMESTAMP from the CLI returning different values. If one
disables syntax checking with \set check_syntax=false and one changes their
prompt1 to not require a query, perhaps with \set prompt1=%n@%M>;, the
command will become idempotent from the CLI.

cockroach-teamcity · 2022-05-02T05:00:23Z

This change is

stevekuznetsov · 2022-05-02T14:43:46Z

pkg/sql/conn_executor.go

+	if err == nil {
+		res.SetColumns(ctx, colinfo.ResultColumns{
+			{
+				Name:   "causality_token",


One area where a new user to the system can get hung up is all of the different ways that these timestamps are referred to - crdb_internal_mvcc_timestamp, cluster_logical_timestamp, "hybrid-logical clock timestamps", etc. While this result column name is specific and correct, it does add another descriptor to this concept and require that the user know how to connect the dots.

Fair, let's see what other folks say regarding names. I feel no strong attachment towards this one.

ajwerner · 2022-05-04T22:20:43Z

I'm thinking I'd like to extend this to work also in the case where you issue an implicit transaction with the special statement as the last statement in the simple protocol. This should be relatively simple to detect, but it probably means pushing the logic to detect our special statement into the pgwire package.

michae2 · 2022-05-05T21:11:18Z

Could it also be extended to an implicit transaction for a statement like INSERT ... RETURNING crdb_internal.commit_with_causality_token()? (Maybe that's outside the scope of this PR...)

This is cool!

ajwerner · 2022-05-17T20:38:29Z

INSERT ... RETURNING crdb_internal.commit_with_causality_token()

I think this is hard to reason about. Instead I'm thinking of supporting a final statement that looks like the one we support in explicit transactions which can be used to regain the autocommit behavior in simple protocol multi-statement batches.

tamird · 2022-05-19T18:46:45Z

Would it be possible to spell this COMMIT RETURNING? That syntax may extend to non-transactional statements better.

/cc @jakedt

ajwerner · 2022-05-19T19:14:47Z

Would it be possible to spell this COMMIT RETURNING?

It gets tricky when you think about the wire protocol. I don't think it's valid to send DataRow after a non-data returning command like COMMIT. One option is to treat it exactly as the current PR does, which is to say, returning that row as thouhg COMMIT RETURNING CAUSALITY TOKEN were syntactic sugar for the above SELECT crdb_internal.commit_with_causality_token(); and in the wire protocol gets called a SELECT 1 and tells the driver that it's still in a transaction.

{"Type":"DataRow","Values":[{"text":"1652986747918593000.0000000000"}]}
{"Type":"CommandComplete","CommandTag":"SELECT 1"}
{"Type":"ReadyForQuery","TxStatus":"T"}

I haven't played with it, but I suspect that there are drivers out there which would be unhappy with this behavior. It's also hard to imagine how to work this into drivers which manage transaction lifecycles themselves. Is the idea that if you had a pgx.Tx that you'd subsequently issue a Commit() after you issued the above using QueryRow?

Like if instead it sent back the below, which is data followed by saying there was a COMMIT command and now the connection is not in a transaction, I can't imagine that drivers will be happy.

{"Type":"DataRow","Values":[{"text":"1652986747918593000.0000000000"}]}
{"Type":"CommandComplete","CommandTag":"COMMIT"}
{"Type":"ReadyForQuery","TxStatus":"I"}

In summary, I guess I'm okay with different syntax but it's a little weird to treat something with a COMMIT as a select. I may have some things wrong or misunderstand the ask, and I'd love to be schooled on why my analysis is bunk.

tamird · 2022-05-19T21:10:00Z

Maybe it's useful to explain the motivation - in the current proposal where the SELECT and COMMIT are separate operations, doesn't the SELECT cause the transaction to be become non-retryable? I've been away from this a long time, but it would seem to me that any write-only transactions would become read-write using this (SELECT) syntax, which is undesirable in some cases.

ajwerner · 2022-05-19T21:29:39Z

doesn't the SELECT cause the transaction to be become non-retryable?

The SELECT only causes the transaction to become non-retryable if it returns rows. This special select only returns its row if the transaction successfully commits; the select actually internally is committing the transaction. This special select doesn't cause the transaction to become read-only, it causes the transaction to be unusable in any way except COMMIT. This is equivalent to what we do with RELEASE SAVEPOINT cockroach_restart, if you're familiar.

it would seem to me that any write-only transactions would become read-write using this (SELECT) syntax

What is a write-only transaction in SQL? It's not a concept in cockroach as far as I know. Is there something special about that concept? Most sql write commands do reads internally. The only way to avoid a read (i.e. do a "blind write") in SQL syntax is an UPSERT which provides every column.

I think if we were to introduce such a concept, it wouldn't be safe to just look at the top-level kinds. Note also that top-level selects statements can also happily do writes both via CTE and, eventually, UDFs. Even more wild is that a select statement can call certain special builtins which do schema changes.

ajwerner · 2022-05-19T21:34:51Z

Maybe it's useful to explain the motivation

The motivation is to be able to provide a client with the commit timestamp of a transaction which they've issued and successfully committed. A design goal is to make it such that you don't need to be particularly careful with your connection handling in order to associate the timestamp with the right connection. This constraint leads towards a design whereby the transaction is internally in cockroach committed before the wire protocol or client driver thinks it is so that as far as the client driver is concerned, the transaction is still open when the timestamp is communicated. Any approach which attempts to retrieve the timestamp after the driver issues COMMIT is risky; some drivers put connections back into a pool as soon as the transaction is committed. Mucking with the COMMIT syntax itself is also something of a non-starter. Most drivers handle it specially as part of the transaction object lifecycle. Hence the current design. We're lying to the driver about the transaction state and then allowing the user to achieve their goal and keeping the driver happy so long as this sentinel query is the final query in the transaction.

in the current proposal where the SELECT and COMMIT are separate operations,

Yeah, this is true, but the COMMIT is there just to move the connection back to a state of being usable. Nothing meaningful is allowed to happen between the two (except the other special cockroach thing: RELEASE SAVEPOINT cockroach_restart).

ajwerner · 2022-05-19T21:46:19Z

I guess revisiting this, I'm far from wedded to the actual statement we use. The current thing is pretty gross. I think using a COMMIT tag statement will be bad for client driver interactions but I'm thinking something like SHOW COMMIT TIMESTAMP or something might be better than the awkward hack I have in there now and would be a lot easier to detect because it'd be its own ast node in the grammar.

michae2 · 2022-05-19T22:32:11Z

I think using SELECT is ok. It's similar to SELECT CURRENT_TRANSACTION_ID() in SQL server.

Interestingly, it looks like there is pg_xact_commit_timestamp in postgres which I think can be used after the transaction commits (IIUC).

knz · 2022-05-26T16:10:48Z

FWIW another way to do this is to have COMMIT store the causality token in a session var, and have the client use SELECT get_config('...') to retrieve it after COMMIT completes.

bdarnell · 2022-05-26T16:21:55Z

I think using a COMMIT tag statement will be bad for client driver interactions but I'm thinking something like SHOW COMMIT TIMESTAMP or something might be better than the awkward hack

I think having something that looks like a SELECT changing the connection's transaction status to committed is also going to be bad for drivers. None of the options are great for drivers.

It's common for drivers to issue the COMMIT statement on your behalf and discard any results, so anything based on adding results to COMMIT is going to be awkward to use. For drivers that do let you call conn.execute("COMMIT") yourself, though, I don't think there's a fundamental obstacle to a variant like conn.query("COMMIT RETURNING TIMESTAMP"). We've seen this with the RETURNING clauses elsewhere - drivers don't "know" which commands return results and which ones don't, it's determined by a combination of what the server returns and which interface the caller uses.
Hiding the commit in a SELECT statement is likely to confuse a driver that may still want to issue its own COMMIT/ROLLBACK at the end of its transaction scope. I think this option is strictly worse than introducing something like COMMIT RETURNING TIMESTAMP.
An after-the-fact SELECT/SHOW commit_timestamp() is probably the most broadly acceptable solution driver-wise; this is a fairly common pattern in other databases (widely used in the case of mysql's select last_insert_id()). However, because the postgres idiom for this is INSERT RETURNING, postgres drivers may have less support for a post-txn query than other drivers. It may be hard to reuse the connection for this purpose before it gets returned to the pool.
Selecting the commit timestamp in between a RELEASE SAVEPOINT cockroach_restart and a COMMIT is on paper well-tolerated by all drivers, more or less matches the expected semantics of all the statements involved (for drivers which map savepoints to nested transactions), and cleanly guarantees that you're using the right connection, but in practice we know that getting people to adopt this syntax is difficult.

My vote is for a SELECT/SHOW statement that returns the token for the last transaction committed on this transaction. It should be supported both after a plain commit for connection pooling strategies that permit this, and after RELEASE SAVEPOINT when returning connections to the pool makes this a problem.

otan · 2022-05-30T03:31:51Z

(removing sql-experience whilst this in debate, feel free to re-add when it's time for another look)

ajwerner · 2022-05-30T22:44:13Z

Selecting the commit timestamp in between a RELEASE SAVEPOINT cockroach_restart and a COMMIT is on paper well-tolerated by all drivers, more or less matches the expected semantics of all the statements involved (for drivers which map savepoints to nested transactions), and cleanly guarantees that you're using the right connection, but in practice we know that getting people to adopt this syntax is difficult.

I'm cool with this. The major problem I have with this is that existing libraries which exist and utilize RELEASE SAVEPOINT cockroach_restart won't be able to adopt the causality token without a change to that library. The motivating use case was cockroach-go's crdb.ExecuteTx.

In the current implementation, one can do something like:

var hybridLogicalTimestamp apd.Decimal
if err := crdbpgx.ExecuteTx(ctx, client, pgx.TxOptions{}, func(tx pgx.Tx) error {
	if _, err := tx.Exec(
		ctx, 
		`INSERT INTO kv (key, value) VALUES ($1, $2);`,
		 uuid.New().String(), strings.Repeat("x", size),
	); err != nil {
		return err
	}
	return tx.QueryRow(
		ctx, `SELECT crdb_internal.commit_with_causality_token()`,
	).Scan(&hybridLogicalTimestamp)
}); err != nil {
	return err
}

One easy answer is just to allow either order. I'm happy to be convinced that it's not worth it. I think I understand the feedback and will do something that I hope will please the crowd.

knz · 2022-11-14T14:41:59Z

When sent as a stand-alone single-statement implicit transaction, it will
return the commit timestamp of the previously committed transaction.

Maybe worth mentioning in the release note that this behavior is idempotent.

ajwerner · 2022-11-14T16:29:19Z

Maybe worth mentioning in the release note that this behavior is idempotent.

Done. I'll file an issue to have the CLI warn if it's going to be issuing queries unbeknownst to the user.

ajwerner · 2022-11-14T16:29:44Z

@rafiss this seems like a PR you might be well suited to review. If not, please advise on who else might be.

rafiss

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @stevekuznetsov)

pkg/sql/conn_executor_exec.go line 1825 at r3 (raw file):

// CommitWait state because such a savepoint had been installed when
// the transaction was committed via SHOW COMMIT TIMESTAMP. The act of
// checking whether the statement should be accepted also prevents any

this seems like a confusing side effect for a caller keep track of (i.e. the act of checking being a thing that toggles the flag). would it be more natural if "accepting the statement" is the thing that toggles the flag?

pkg/sql/conn_executor_show_commit_timestamp.go line 104 at r3 (raw file):

}

// execShowCommitTimestampInOpenState deals with the special statement

nit: wrong function name in the comment

pkg/sql/conn_io.go line 143 at r3 (raw file):

	// simple protocol Query message that contains a batch of 1 or more queries.
	LastInBatch bool
	// LastInBatch indicates that this command contains the second-to-last query

nit: variable name in the comment

pkg/sql/sem/builtins/builtins.go line 195 at r3 (raw file):

// committed. The implementation of this function here is a shim; the function
// is processed explicitly by the connExecutor.
const CommitWithCausalityTokenName = "commit_with_causality_token"

is this needed?

pkg/sql/logictest/testdata/logic_test/show_commit_timestamp line 211 at r3 (raw file):

subtest prepare

# You cannot prepare SHOW COMMIT TIMESTAMP because it is not preparable.

could you add a test in pgwire/testdata/ and Parse a SHOW COMMIT TIMESTAMP statement?

Fixes cockroachdb#79591 Relates to cockroachdb#7945 Release note (sql change): A new sql statement `SHOW COMMIT TIMESTAMP` has been added. This statement can be used to retrieve the commit timestamp of the current explicit transaction, current multi-statement implicit transaction, or previous transaction. The statement may be used in a variety of settings to maximize its utility in the face of connection pooling. When used as a part of an explicit transaction, the statement implicitly commits the transaction internally before being able to return a causality token. This is similar to the `RELEASE cockroach_restart` behavior; after issuing this statement, commands to the transaction will be rejected until `COMMIT` is issued. When used as part of a multi-statement implicit transaction, the statement must be the final statement. If it occurs in the middle of a multi-statement implicit transaction, it will be rejected with an error. When sent as a stand-alone single-statement implicit transaction, it will return the commit timestamp of the previously committed transaction. If there was no transaction on the connection, or the previous transaction did not commit (either rolled back or encountered an error), the command will fail with an error code 25000 (`InvalidTransactionState`).

ajwerner

TFTR!

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @stevekuznetsov)

pkg/sql/conn_executor_exec.go line 1825 at r3 (raw file):