Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support causality tokens #7945

Open
tbg opened this issue Jul 20, 2016 · 10 comments
Open

sql: support causality tokens #7945

tbg opened this issue Jul 20, 2016 · 10 comments
Labels
A-kv-transactions Relating to MVCC and the transactional model. A-sql-executor SQL txn logic C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team X-nostale Marks an issue/pr that should be ignored by the stale bot

Comments

@tbg
Copy link
Member

tbg commented Jul 20, 2016

My apologies if this exists somewhere - wasn't able to find it.

We already support reading the causality token of a transaction via cluster_logical_timestamp, but we provide no way for a causally downstream transaction to ingest that token (i.e. making sure that its assigned timestamp is at least that represented by the token).

When implementing this, we must validate the token. This is a tricky problem when clients are not trusted - the token will make it into the hybrid logical clocks, so supplying a bogus high value could easily trigger our clock offset mechanisms (or even worse, not trigger them and lead to stale reads).
The baseline is making sure we're not MaxOffset ahead of the current HLC timestamp, but that's not enough. Perhaps causality tokens which can be passed in that way ought to be certified by the node certs and be totally opaque to the client.

Jira issue: CRDB-6165

@petermattis petermattis added this to the Later milestone Feb 22, 2017
@jordanlewis jordanlewis added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-transactions Relating to MVCC and the transactional model. labels Apr 26, 2018
@knz
Copy link
Contributor

knz commented May 9, 2018

Discussed today extensively with @bdarnell and @andreimatei

  • as an accident of implementation, cluster_logical_timestamp() is a causality token with a very poor UX
  • we should make a more dedicated SQL syntax to pass causality tokens
  • andrei has this idea to pass a token that would bump the HLC clock on the node receiving the token

@andreimatei can you flesh out your thoughts here?

@knz knz added the A-sql-executor SQL txn logic label May 9, 2018
@knz
Copy link
Contributor

knz commented May 9, 2018

(For context: we're having this discussion again because there were multiple community questions around this area recently)

@petermattis petermattis removed this from the Later milestone Oct 5, 2018
@ajwerner
Copy link
Contributor

When implementing this, we must validate the token. This is a tricky problem when clients are not trusted - the token will make it into the hybrid logical clocks, so supplying a bogus high value could easily trigger our clock offset mechanisms (or even worse, not trigger them and lead to stale reads).

One way we could mitigate the risk of the client causality token pushing the clock too far into the future would be to force the client to wait until the wall time of timestamp to which it wants to push the HLC is within max offset of the wall clock on the gateway machine. If a client tries to push the clock an hour ahead, it's just going to sleep for an hour. We don't expect the causality token to be far in the future.


As for different interfaces, there's been some offline discussions. One way would be to provide a sql interface. Without driver support anything we do here is likely to be clunky.

Imagine a function commit_with_timestamp() which implicitly commits the transaction and returns the commit timestamp. The only valid statement after issuing a select from this function would be to call commit. The connExecutor would be moved to a committed state. These are exactly the semantics of RELEASE SAVEPOINT today (see docs).

In this proposal a client could obtain a causality token as follows:

BEGIN;
...
SELECT commit_timestamp(); -- gives back a DECIMAL hlc timestamp
COMMIT

As for the client using it, perhaps we add something like:

SELECT crdb_wait($1);

Which will sleep until the timestamp is at least close to the wall clock and then bump the gateway HLC.

Drivers could help make the use of the causality token seamless by injecting queries on behalf of the user. I suspect that the reading of the timestamp and properly passing it to the client driver would be the biggest concern. Perhaps the driver can translate Commit() directly to this sequence. The driver may also inject the waiting command prior to every statement while not in an open transaction.

The causality token will need to be ratcheted across all connections in the connection pool. Commit in the driver shouldn't return to the caller until the causality token has been propagated.

Another way the token could be transferred to a client would be through a NoticeResponse in the pg wire protocol indicating the commit timestamp. We could use a connection setting to determine whether the data is sent. I haven't been able to come up with a better protocol extension to provide the causality token but I suspect there is one. Perhaps we could have a pre-defined bound statement so that each connection doesn't need to prepare it? On the whole a protocol-layer extension would likely lead to less confusing documentation and user experiences when it ultimately is adopted. The downside is that it'll be impossible to adopt it above an existing driver.

Another thing which seems like it could be a problem re: causality tracking is #7526 where causality may not propagate properly across distsl during distributed execution.

@knz
Copy link
Contributor

knz commented Nov 22, 2019

One way we could mitigate the risk of the client causality token pushing the clock too far into the future would be to force the client to wait until the wall time of timestamp

This has also been discussed before. In fact we even have recommended users to do so using existing SQL features.

Say your previous sampled value is X, with the same unit (nanos+logical) as cluster_logical_timestamp, let's say value X, then you can wait for your causality-sufficient delay using:

SELECT pg_sleep(X::INT::FLOAT / 1e9 - clock_timestamp()::FLOAT + <maxoffset in secs>)

(the timestamp -> float/decimal conversion gives fractional seconds; X::INT gets rid of the logical part, and Y::FLOAT/1e9 moves the decimal point to the right place; finally, pg_sleep can wait for fractional seconds up to microsecond precision)

ISTM we should provide syntactic sugar for this formula but maybe we don't need much more!

As for the client using it, perhaps we add something like:
SELECT crdb_wait($1);

Good idea, I like that! Good syntactic sugar for the above.

As for different interfaces, [...] Without driver support anything we do here is likely to be clunky.

So here there's some data points missing:

  1. it's likely that causality tokens are super important very soon, much earlier than we are able to provide drivers (and/or earlier than our users are able to adopt them). So we have a large incentive to find a solution that does not require custom drivers.

  2. I disagree that it's going to be clunky.

We already provide "interstitial statements" that can be issued by clients outside of the SQL transaction state machine: SHOW SYNTAX. We can envision a SHOW TRANSACTION CAUSALITY TOKEN that also can be issued as such a freebie, just after a COMMIT, and displays the token of the last COMMIT. Or it could be used just after the RELEASE SAVEPOINT cockroach_restart and before the final COMMIT, in case the client framework code does not support issuing statement ouside of a BEGIN...COMMIT block.

Alternatively, we can store the token of the last txn in a session var so that the client can use SHOW last_txn_token to display the token of the last fully committed txn.

However Ben pointed out that we should not rely on syntax that must be used outside of a begin..commit block because clients often use a connection pool and may not control which connection is used for the statement just after a COMMIT. Hence a big focus on something in-between the RELEASE SAVEPOINT cockroach_restart and before the final COMMIT.

  1. It's actually supper important to have a separate statement to sample the token so as not to use cluster_logical_statement() itself, at least as long as that function blocks txn refreshes.

@bdarnell
Copy link
Contributor

it's likely that causality tokens are super important very soon, much earlier than we are able to provide drivers (and/or earlier than our users are able to adopt them). So we have a large incentive to find a solution that does not require custom drivers.

Why do you think this? I feel strongly in the opposite direction: I don't believe we can make causality tokens usable enough to be relevant without custom drivers.

However Ben pointed out that we should not rely on syntax that must be used outside of a begin..commit block because clients often use a connection pool and may not control which connection is used for the statement just after a COMMIT.

This is one way being in the driver can be important. The driver can do a COMMIT followed by a SHOW last_txn_token even if the application can't (this pattern is in fact common for mysql which lacks INSERT RETURNING). I'd rather design the syntax with the assumption that it will be implemented by drivers than introduce more RELEASE SAVEPOINT cockroach_restart style hacks that appear to get the job done without driver support but are very tricky to use in practice.

@github-actions
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@knz
Copy link
Contributor

knz commented Jun 10, 2021

Still relevant, and we even discussed it on slack in the past few weeks. (internal link)

@jlinder jlinder added the T-kv KV Team label Jun 16, 2021
ajwerner added a commit to ajwerner/cockroach that referenced this issue May 2, 2022
Fixes cockroachdb#79591

Relates to cockroachdb#7945

Release note (<category, see below>): A new adds a new special sentinel builtin
`crdb_internal.commit_with_causality_token` has been added which can be used
exclusively as the only expression in a `SELECT` statement at the end of an
explicit transaction for the purpose of committing the the transaction and
retrieving the now committed transaction's commit timestamp. This timestamp
can then be used as the input to, say, an AS OF SYSTEM TIME clause or a
changefeed in order to observe the exact snapshot at which the transaction
committed. Note that after issuing such a statement successfully, the client
must still issue a COMMIT to reset the connection state for the next
transaction.
ajwerner added a commit to ajwerner/cockroach that referenced this issue May 3, 2022
Fixes cockroachdb#79591

Relates to cockroachdb#7945

Release note (<category, see below>): A new adds a new special sentinel builtin
`crdb_internal.commit_with_causality_token` has been added which can be used
exclusively as the only expression in a `SELECT` statement at the end of an
explicit transaction for the purpose of committing the the transaction and
retrieving the now committed transaction's commit timestamp. This timestamp
can then be used as the input to, say, an AS OF SYSTEM TIME clause or a
changefeed in order to observe the exact snapshot at which the transaction
committed. Note that after issuing such a statement successfully, the client
must still issue a COMMIT to reset the connection state for the next
transaction.
ajwerner added a commit to ajwerner/cockroach that referenced this issue Nov 14, 2022
Fixes cockroachdb#79591

Relates to cockroachdb#7945

Release note (sql change): A new sql statement `SHOW COMMIT TIMESTAMP` has been
added. This statement can be used to retrieve the commit timestamp of the
current explicit transaction, current multi-statement implicit transaction, or
previous transaction. The statement may be used in a variety of settings to
maximize its utility in the face of connection pooling.

When used as a part of an explicit transaction, the statement implicitly
commits the transaction internally before being able to return a causality
token. This is similar to the `RELEASE cockroach_restart` behavior; after
issuing this statement, commands to the transaction will be rejected until
`COMMIT` is issued.

When used as part of a multi-statement implicit transaction, the statement
must be the final statement. If it occurs in the middle of a multi-statement
implicit transaction, it will be rejected with an error.

When sent as a stand-alone single-statement implicit transaction, it will
return the commit timestamp of the previously committed transaction. If there
was no transaction on the connection, or the previous transaction did not
commit (either rolled back or encountered an error), the command will fail with
an error code 25000 (`InvalidTransactionState`).
ajwerner added a commit to ajwerner/cockroach that referenced this issue Nov 21, 2022
Fixes cockroachdb#79591

Relates to cockroachdb#7945

Release note (sql change): A new sql statement `SHOW COMMIT TIMESTAMP` has been
added. This statement can be used to retrieve the commit timestamp of the
current explicit transaction, current multi-statement implicit transaction, or
previous transaction. The statement may be used in a variety of settings to
maximize its utility in the face of connection pooling.

When used as a part of an explicit transaction, the statement implicitly
commits the transaction internally before being able to return a causality
token. This is similar to the `RELEASE cockroach_restart` behavior; after
issuing this statement, commands to the transaction will be rejected until
`COMMIT` is issued.

When used as part of a multi-statement implicit transaction, the statement
must be the final statement. If it occurs in the middle of a multi-statement
implicit transaction, it will be rejected with an error.

When sent as a stand-alone single-statement implicit transaction, it will
return the commit timestamp of the previously committed transaction. If there
was no transaction on the connection, or the previous transaction did not
commit (either rolled back or encountered an error), the command will fail with
an error code 25000 (`InvalidTransactionState`).
craig bot pushed a commit that referenced this issue Nov 22, 2022
80848: sql: add SHOW COMMIT TIMESTAMP to retrieve a causality token r=ajwerner a=ajwerner

Fixes #79591

Relates to #7945

Release note (sql change): A new sql statement `SHOW COMMIT TIMESTAMP` has been
added. This statement can be used to retrieve the commit timestamp of the
current explicit transaction, current multi-statement implicit transaction, or
previous transaction. The statement may be used in a variety of settings to
maximize its utility in the face of connection pooling.

When used as a part of an explicit transaction, the statement implicitly
commits the transaction internally before being able to return a causality
token. This is similar to the `RELEASE cockroach_restart` behavior; after
issuing this statement, commands to the transaction will be rejected until
`COMMIT` is issued.

When used as part of a multi-statement implicit transaction, the statement
must be the final statement. If it occurs in the middle of a multi-statement
implicit transaction, it will be rejected with an error.

When sent as a stand-alone single-statement implicit transaction, it will
return the commit timestamp of the previously committed transaction. If there
was no transaction on the connection, or the previous transaction did not
commit (either rolled back or encountered an error), the command will fail with
an error code 25000 (`InvalidTransactionState`).

The `SHOW COMMIT TIMESTAMP` statement is idempotent; it can be issued
multiple times both inside of and after transactions and will return the same result.
One rough edge is that the `cockroach sql` cli command will, by default, send
statements on behalf of the user which will lead to repeated issuances of 
`SHOW COMMIT TIMESTAMP` from the CLI returning different values. If one
disables syntax checking with `\set check_syntax=false` and one changes their
prompt1 to not require a query, perhaps with `\set prompt1=%n@%M>;`, the
command will become idempotent from the CLI.

91832: CODEOWNERS: updater importer to sql-exp r=rafiss a=dt

Release note: none.
Epic: none.

92338: util/stop: finish task span before Stop() returns r=andreimatei a=andreimatei

Before this patch, a task's span was finished after the stopper considered the task to be finished. This was a problem for a test who wanted to assume that, once stopper.Stop() returns, all task spans are finished - which is a reasonable contract to expect. This patch reorders the span finish accordingly.

Fixes #83886

Release note: None
Epic: None

Co-authored-by: Andrew Werner <awerner32@gmail.com>
Co-authored-by: David Taylor <tinystatemachine@gmail.com>
Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
@github-actions
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 10, 2023
@knz knz added X-nostale Marks an issue/pr that should be ignored by the stale bot and removed X-stale labels Oct 10, 2023
@knz
Copy link
Contributor

knz commented Oct 10, 2023

still relevant

@knz knz reopened this Oct 10, 2023
@brianbraunstein
Copy link

Any progress or update here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-transactions Relating to MVCC and the transactional model. A-sql-executor SQL txn logic C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team X-nostale Marks an issue/pr that should be ignored by the stale bot
Projects
No open projects
Status: Incoming
Development

No branches or pull requests

9 participants