
sql: ROLLBACK TO SAVEPOINT does not release locks #46075

Closed
knz opened this issue Mar 13, 2020 · 22 comments · Fixed by #41569
Assignees
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. docs-known-limitation docs-todo

Comments

@knz
Contributor

knz commented Mar 13, 2020

tldr: The new savepoints implementation and SELECT FOR UPDATE (row locks) currently behave in a way that is incompatible with postgres. This directly violates our commitment to pg compatibility and introduces the risk of application-level deadlocks.

This issue is to ensure that ROLLBACK TO SAVEPOINT either releases locks, like in pg, or fails with an "unimplemented" error.

This has been discussed with @tbg

Details

  • pg says "ROLLBACK TO SAVEPOINT" must release locks placed "under" the savepoint
  • crdb currently does not do this and keeps the locks active.

There are two problems with this:

  • this is different from pg (a violation of our commitment to pg compat)
  • it creates a risk of application-level deadlock. I recall at least one test in the Hibernate test suite that demonstrates this: the application takes a lock, then starts a separate thread that waits on the lock, then releases the lock in the first thread with ROLLBACK TO SAVEPOINT, and expects the separate thread to resume execution. Today this would not work in crdb (see the sketch below).
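
For illustration, here is a minimal two-session sketch of that pattern; the table, column, and savepoint names are hypothetical:

    -- Session 1:
    BEGIN;
    SAVEPOINT s;
    SELECT v FROM kv WHERE k = 1 FOR UPDATE;  -- takes a row lock under the savepoint

    -- Session 2 (the separate thread, on its own connection):
    SELECT v FROM kv WHERE k = 1 FOR UPDATE;  -- blocks on session 1's lock

    -- Session 1:
    ROLLBACK TO SAVEPOINT s;
    -- pg: the lock is released and session 2 resumes.
    -- crdb today: the lock is kept and session 2 stays blocked.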

Nuance: rollback after a serialization error ("retry error")

The current crdb behavior is motivated as follows: after a retry error, keeping the locks active during ROLLBACK TO SAVEPOINT increases the chances that the new txn attempt will succeed.

However the current implementation keeps the locks unconditionally, causing the pg compat problem described above.

What we can do instead is the following:

  • After a retry error, keep the locks and allow ROLLBACK TO SAVEPOINT (this is the client-side retry case sketched after this list)
  • After other types of errors, either release the locks, like in pg, or report an error to the client that ROLLBACK TO SAVEPOINT is not supported after a FOR UPDATE query
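
As an illustration of the first case, here is a sketch of the client-side retry protocol, where keeping the locks across the rollback is what helps the new attempt succeed (table and column names are hypothetical; cockroach_restart is the savepoint name conventionally used by that protocol):

    BEGIN;
    SAVEPOINT cockroach_restart;
    UPDATE kv SET v = v + 1 WHERE k = 1;
    -- If a statement or the RELEASE fails with a retry error (SQLSTATE 40001):
    ROLLBACK TO SAVEPOINT cockroach_restart;  -- keeping locks here helps the retry succeed
    -- ... re-issue the statements, then:
    RELEASE SAVEPOINT cockroach_restart;
    COMMIT;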

Nuance: implicit locks under UPDATE statements

The implementation places locks for UPDATE statements. Naively preventing ROLLBACK TO SAVEPOINT after locks have been placed would block the ability to do ROLLBACK TO SAVEPOINT after an UPDATE statement. This would severely limit the savepoint feature.

However, looking at the details: the locks laid down by an UPDATE statement are subsequently converted to write intents. If our understanding (knz+@tbg) is correct, there is no lock left over after the UPDATE statement completes (assuming no txn conflict yielding a retry error). Intents can be rolled back without issue with ROLLBACK TO SAVEPOINT.
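
A minimal sketch of that flow (table and column names are hypothetical); whether the intent itself still blocks concurrent transactions after the rollback is discussed in the comments below:

    BEGIN;
    SAVEPOINT before_update;
    UPDATE kv SET v = 2 WHERE k = 1;      -- lays down a write intent on k = 1
    ROLLBACK TO SAVEPOINT before_update;  -- the value change is undone...
    SELECT v FROM kv WHERE k = 1;         -- ...this reads the original value
    COMMIT;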

@knz
Contributor Author

knz commented Mar 13, 2020

@nvanbenschoten @tbg @andreimatei please comment on this issue ASAP about whether we're overlooking anything. For now this issue is a release blocker.
I'll work on this next week.

@knz knz added A-sql-pgcompat Semantic compatibility with PostgreSQL C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels Mar 13, 2020
@knz knz self-assigned this Mar 13, 2020
@andreimatei
Contributor

Intents can be rolled back without issue with ROLLBACK TO SAVEPOINT.

There's no difference between intents and unreplicated locks (SFU) from the perspective of savepoints. Neither is rolled back by a savepoint. Both continue blocking a reader or writer after the rollback to savepoint.
It's not trivial to release the locks either because we don't track what locks were acquired at which sequence number. We also like condensing the locks into larger spans to stay below a memory budget - and so we lose fidelity.

I think that when we can, it'd be good to release locks early, but I don't think it's nearly as important as you do. I find it hard to believe that an application would rely on this behavior. As far as I can find, Postgres does not document this behavior (and in general locking is not something that's documented well). How would one code such an application? It'd need to roll back to a savepoint in one transaction, and afterwards kick off work on a different connection while blocking the original transaction on that work? My guess is that the Hibernate test is synthetic; I'd like to see it.

@knz
Contributor Author

knz commented Mar 13, 2020

While we investigate, I think it's reasonable to add the error message "this is not supported" until we get clarity. It would be a mistake to let something behave in a way we're not confident about, let users get used to it and start depending on it, and then later change it.

@andreimatei
Contributor

"This" being a rollback in any transaction that has written anything?

@knz
Contributor Author

knz commented Mar 13, 2020

"This" being a rollback in any transaction that has written anything?

that has locked anything with in-memory locks I suppose?

@andreimatei
Contributor

Well but as I was telling you, intents have just the same effect as the unreplicated locks.

@andy-kimball
Contributor

andy-kimball commented Mar 13, 2020

  1. We have precedent for this behavior: For years now, we have had a limited form of rollback (to support automatic restarts) and a limited form of locks (intents). We also enabled SFU syntax in 19.2. We've never removed intents during rollback.

  2. PG incompatibility is low risk for this case: Like Andrei, I struggle to think of real application patterns where the locking rollback behavior will cause problems. So, while I concede that our locking behavior is not fully compatible with PG's, I see it as low likelihood that this will cause problems for customers.

  3. We have good reasons to retain locks: We made an explicit decision to keep intents when we retry. This is necessary to avoid starvation scenarios - not theoretical starvation scenarios, but actual ones that our customers have run into. Retry is a special case of explicit rollback, but the same factors apply to explicit rollback. Also, I'd prefer to have one consistent behavior, both for the general case and for specific cases. This is a distributed system, and locking behavior is necessarily going to differ from the single machine case. I think it's OK (and even desirable) to diverge from PG behavior here.

My proposal would be to add a docs item to call out this difference in behavior.

@andy-kimball
Contributor

As a side note, @andreimatei, the locking rollback behavior is documented by PG:

Once acquired, a lock is normally held till end of transaction. But if a lock is acquired after establishing a savepoint, the lock is released immediately if the savepoint is rolled back to. This is consistent with the principle that ROLLBACK cancels all effects of the commands since the savepoint. The same holds for locks acquired within a PL/pgSQL exception block: an error escape from the block releases locks acquired within it.
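
For illustration, a minimal PL/pgSQL sketch of that documented PostgreSQL behavior (table and column names are hypothetical); the exception block acts as a subtransaction, so a lock taken inside its body is released before the handler runs:

    DO $$
    BEGIN
      BEGIN
        PERFORM k FROM kv WHERE k = 1 FOR UPDATE;  -- lock acquired inside the block
        RAISE division_by_zero;                    -- error aborts the block body
      EXCEPTION WHEN division_by_zero THEN
        NULL;  -- by the time the handler runs, the FOR UPDATE lock above has been released
      END;
    END $$;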

@nvanbenschoten
Member

nvanbenschoten commented Mar 13, 2020

I'm confused about the sudden urgency of this. We discussed the question of whether ROLLBACK TO SAVEPOINT should eagerly release locks back in October and all decided that for the initial implementation, we wouldn't eagerly release locks on ROLLBACK. Why the sudden change?

As others here have mentioned, this is orthogonal to SFU and unreplicated locks. Unreplicated locks support partial rollback based on IgnoredSeqNums just like any other locks. The problem is that the txn coordinator never initiates this partial rollback.

Andrei and I actually discussed this a few days ago. Everything below the KV client supports partial rollbacks of locks, so eagerly releasing them on rollback using ResolveIntent requests would not be too difficult. This is the only real blocker we saw to this approach:

It's not trivial to release the locks either because we don't track what locks were acquired at which sequence number.

That said, I'm now thinking that there might be a different approach that would avoid the O(num locks released) behavior on rollbacks and would also avoid the requirement that we track these lock spans with full sequence number fidelity. Instead of resolving each lock within the set of IgnoredSeqNums on rollback, we could simply augment a transaction's record with its new set of IgnoredSeqNums on rollback. We could then begin passing conflicting sequence numbers around in txn pushes and take the IgnoredSeqNums set into account during txn push evaluation. This would shift the burden of removing rolled-back locks, while the txn is still running, onto other conflicting transactions, but that's not necessarily a bad thing because it means we would only eagerly roll back the locks that actually need to be rolled back. There are some complications here around locks written at multiple sequence numbers, but it all seems doable (though definitely not in the stability period).

@nvanbenschoten
Member

Intents can be rolled back without issue with ROLLBACK TO SAVEPOINT.

Maybe this is where the confusion is coming from. This statement is incorrect.

@knz
Contributor Author

knz commented Mar 14, 2020

Thanks everyone for the discussion.
In summary, the current behavior is going to be a known departure from pg semantics and this needs to be called out in docs in several places (IIRC it was not previously).

I think we can hand over the handling of this to @rmloveland and ensure that cockroachdb/docs#6675 gets augmented to properly cover this topic.

I also would like to ensure that @awoods187 propagates this newfound knowledge/understanding to the SQL and AppDev teams. @awoods187 can you carry this over?

@knz knz changed the title sql: ROLLBACK TO SAVEPOINT after SFU must either release locks or fail sql: ROLLBACK TO SAVEPOINT after SFU does not release locks Mar 14, 2020
@nvanbenschoten nvanbenschoten changed the title sql: ROLLBACK TO SAVEPOINT after SFU does not release locks sql: ROLLBACK TO SAVEPOINT does not release locks Mar 14, 2020
@awoods187
Contributor

Did we confirm that the Hibernate test is not modeling a behavior that users may intend to take advantage of? I.e. is it testing this behavior on purpose or by accident?

Regardless of the above, it seems reasonable to proceed with this approach for 20.1. We can then gather more information in the 20.2 cycle about prioritizing eagerly releasing these locks (including factoring in the answer to the above question).

@knz
Contributor Author

knz commented Mar 17, 2020

is it testing this behavior on purpose or by accident?

Definitely on purpose.

Did we confirm that the Hibernate test is not modeling a behavior that users may intend to take advantage of?

The Hibernate test is constructed with a particular purpose, but that's not directly relevant here: Andy K and others have stated that regardless of user impact, we're going to keep the current CockroachDB behavior and document it as a known divergence.

That's an "architectural divergence" if you recall the terminology.

The next step here is to make it crystal clear to users that it's going to be different from pg and they must adapt their client code if they observe it's an issue.

@knz
Contributor Author

knz commented Mar 17, 2020

@nvanbenschoten @andreimatei I am adding the following text to the RFC, and I intend that change to the RFC (as well as merging the RFC PR) to also close this issue as a result.
Can you review the phrasing?

Relationship with row locks

CockroachDB supports row locks: both write locks (write intents) and,
since v20.1, read locks (SELECT FOR UPDATE).

  • In PostgreSQL, row locks are released/cancelled upon ROLLBACK TO
    SAVEPOINT.
  • In CockroachDB, row locks are preserved upon ROLLBACK TO SAVEPOINT.

This is an architectural difference which we don't plan to lift any
time soon.

The motivation for preserving locks is to increase the chance that the
transaction will succeed the second time around, in case it encounters
a retry error some time before or at COMMIT time.

Client code that relies on row locks must be reviewed and possibly
modified to account for this difference. In particular, if an
application is relying on ROLLBACK TO SAVEPOINT to release row locks
and allow a concurrent txn touching the same rows to proceed, this
will not work with CockroachDB.

@andreimatei
Contributor

CockroachDB supports row locks: both write locks (write intents) and,
since v20.1, read locks (SELECT FOR UPDATE).

As per Nathan's wrath, we don't have "write locks" and "read locks". We have "replicated" and "unreplicated" exclusive locks.

The motivation for preserving locks is to increase the chance that the
transaction will succeed the second time around, in case it encounters
a retry error some time before or at COMMIT time.

Well, that's the motivation for not releasing locks when a retriable error is encountered, like you say. The motivation for not releasing locks generally upon a rollback to savepoint is simply that it's hard to implement.

@nvanbenschoten
Member

Thanks for writing that up Raphael. I expect that some of that language will also end up in the docs.

This is an architectural difference which we don't plan to lift any
time soon.

This makes it sound like there's a fundamental reason why the CockroachDB architecture cannot remove this difference. I think that's too harsh. If we had another month in this release then we could absolutely address this using either of the approaches discussed earlier in this thread.

As per Nathan's wrath, we don't have "write locks" and "read locks". We have "replicated" and "unreplicated" exclusive locks.

For reference, here are the lock strength and durability definitions. Like you mentioned, we currently support only replicated and unreplicated exclusive locks. In the future, we will likely change the locks acquired by implicit and explicit SFU to be "unreplicated upgrade" locks, which will improve performance on workloads like YCSB-B (95% read, 5% upgrade, zipfian distribution).

@knz
Contributor Author

knz commented Mar 17, 2020

Changed the phrasing to:

CockroachDB supports exclusive row locks. (They come in two variants -
distributed as write intents and, since v20.1, also non-distributed
from SELECT FOR UPDATE. However this distinction is not material
here).

  • In PostgreSQL, row locks are released/cancelled upon ROLLBACK TO
    SAVEPOINT.
  • In CockroachDB, row locks are preserved upon ROLLBACK TO SAVEPOINT.

This is an architectural difference in v20.1 that may or may
not be lifted in a later CockroachDB version.

The main motivation for preserving locks is to increase the chance
that the transaction will succeed the second time around, in case it
encounters a retry error some time before or at COMMIT time.

Client code that relies on row locks must be reviewed and possibly
modified to account for this difference. In particular, if an
application is relying on ROLLBACK TO SAVEPOINT to release row locks
and allow a concurrent txn touching the same rows to proceed, this
will not work with CockroachDB.

WDYT?

@nvanbenschoten
Member

That sounds good to me.

@andreimatei
Contributor

The main motivation for preserving locks is to increase the chance
that the transaction will succeed the second time around, in case it
encounters a retry error some time before or at COMMIT time.

I still take objection to this phrasing :). This paragraph claims that a very specific situation in which we like holding locks is the general reason why we don't roll them back. That's simply not true, and worse, it's confusing because it just leads to obvious questions. We could easily separate the case where we want the locks from the general case where we don't. The problem with releasing locks in the general case, where we would like to release them, is that it's not so easy to do (at least I don't think it is). Nathan was suggesting above that perhaps instead of having the kvclient be responsible for releasing locks on rollback, it'd be pushers that become able to push some locks out of their way. I think that'd work for intents, but I'm not sure how it'd work for the non-replicated locks - the reads that take these locks don't currently have sequence numbers, so we'd need to change that. Also, needing to do better tracking of locks seems like it'd limit our ability to merge locks together, etc.
Basically, it doesn't seem all that easy to me to implement this.

@knz
Contributor Author

knz commented Mar 18, 2020

Look guys the situation is simple:

  • there's a divergence from pg
  • people are going to ask "why"
  • we need an explanation

I don't care what the explanation is. If you think the best explanation is "we plan to do this, but it's not done yet", it's good. If you think the best explanation is "we can't do this, it's really hard, just deal with it", it's also good.

I just need some words that y'all are happy with. Instead of bashing my draft, give me a counter proposal. Thank you.

@andreimatei
Contributor

I just need some words that y'all are happy with. Instead of bashing my draft, give me a counter proposal.

Well you're insisting on writing something here :). Find a way to summarize the discussion. It takes some time, that's why I didn't just do it... How about

Instead of:

The main motivation for preserving locks is to increase the chance
that the transaction will succeed the second time around, in case it
encounters a retry error some time before or at COMMIT time.

"
Releasing locks on a savepoint rollback is not trivial because we'd have to track what sequence number each lock was acquired at, and we'd have to maintain full fidelity for this tracking (preventing us from collapsing different spans in the footprint to save memory). One option is to not eagerly release locks, but to update the transaction record upon a rollback with a list of ignored seq nums. Then pushers could be allowed to push locks corresponding to ignored seq nums. This would mean that the kvserver would have to maintain full fidelity on the locks, however.
There's also a case where we benefit from holding on to locks - when rolling back to an initial savepoint after a retriable error. In that case, chances are the same locks will be requested again, so holding on to them ensures the transaction can acquire them on the retry.
"

@knz
Contributor Author

knz commented Mar 18, 2020

Thanks! I have updated the RFC with this text, split in two parts:

  • a short section without rationale in the guide-level section
  • a longer explanation in the reference-level section.

@craig craig bot closed this as completed in 7303b7f Mar 19, 2020