Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add non-trivial implementations of FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR NO KEY SHARE #40205

Closed
jordanlewis opened this issue Aug 26, 2019 · 6 comments · Fixed by #45701
Assignees
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) X-anchored-telemetry The issue number is anchored by telemetry references.

Comments

@jordanlewis
Copy link
Member

jordanlewis commented Aug 26, 2019

#40206 added parsing for the various row-level lock modes, and treats them like no-ops. This is acceptable for CockroachDB, which only supports SERIALIZABLE transactions, because from a client's perspective there's no way to observe whether any particular transaction needed to be restarted because of a conflict on the rows that were declared as FOR UPDATE, or because of some other conflict - the transaction will have to be restarted in either case.

This issue tracks improving this behavior for performance reasons. There's ongoing work in core to support at least some of these semantics.

@jordanlewis jordanlewis added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-pgcompat Semantic compatibility with PostgreSQL X-anchored-telemetry The issue number is anchored by telemetry references. labels Aug 26, 2019
craig bot pushed a commit that referenced this issue Aug 27, 2019
40206: sql: add row lock modes; make FOR UPDATE a no-op r=jordanlewis a=jordanlewis

This commit adds parsing for the row locking modes: FOR UPDATE, FOR NO
KEY UPDATE, FOR SHARE, FOR KEY UPDATE.

All return unimplemented errors, except for FOR UPDATE, which behaves as
a no-op.

Closes #6583.
Added #40205 to track the remaining kinds of row locking.

Release note (sql change): support parsing the FOR UPDATE modifier on
SELECT clauses, treating it as a no-op, since CockroachDB's transactions
only operate in SERIALIZABLE mode.

Co-authored-by: Jordan Lewis <jordanthelewis@gmail.com>
@jordanlewis jordanlewis changed the title sql: implement FOR NO KEY UPDATE, FOR SHARE, FOR NO KEY SHARE sql: add non-trivial implementations of FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR NO KEY SHARE Aug 27, 2019
jordanlewis added a commit to jordanlewis/cockroach that referenced this issue Oct 2, 2019
The spreadsheet we discussed is unwieldy - hard to edit and impossible to keep
up to date. If we write down blacklists in code, then we can use an approach
like this to always have an up to date aggregation.

So far it seems like there's just a lot of unknowns to categorize still.

The output today:

```
=== RUN   TestBlacklists
 648: unknown                                                (unknown)
 493: cockroachdb#5807   (sql: Add support for TEMP tables)
 151: cockroachdb#17511  (sql: support stored procedures)
  86: cockroachdb#26097  (sql: make TIMETZ more pg-compatible)
  56: cockroachdb#10735  (sql: support SQL savepoints)
  55: cockroachdb#32552  (multi-dim arrays)
  55: cockroachdb#26508  (sql: restricted DDL / DML inside transactions)
  52: cockroachdb#32565  (sql: support optional TIME precision)
  39: cockroachdb#243    (roadmap: Blob storage)
  33: cockroachdb#26725  (sql: support postgres' API to handle blob storage (incl lo_creat, lo_from_bytea))
  31: cockroachdb#27793  (sql: support custom/user-defined base scalar (primitive) types)
  24: cockroachdb#12123  (sql: Can't drop and replace a table within a transaction)
  24: cockroachdb#26443  (sql: support user-defined schemas between database and table)
  20: cockroachdb#21286  (sql: Add support for geometric types)
  18: cockroachdb#6583   (sql: explicit lock syntax (SELECT FOR {SHARE,UPDATE} {skip locked,nowait}))
  17: cockroachdb#22329  (Support XA distributed transactions in CockroachDB)
  16: cockroachdb#24062  (sql: 32 bit SERIAL type)
  16: cockroachdb#30352  (roadmap:when CockroachDB  will support cursor?)
  12: cockroachdb#27791  (sql: support RANGE types)
   8: cockroachdb#40195  (pgwire: multiple active result sets (portals) not supported)
   8: cockroachdb#6130   (sql: add support for key watches with notifications of changes)
   5: Expected Failure                                       (unknown)
   5: cockroachdb#23468  (sql: support sql arrays of JSONB)
   5: cockroachdb#40854  (sql: set application_name from connection string)
   4: cockroachdb#35879  (sql: `default_transaction_read_only` should also accept 'on' and 'off')
   4: cockroachdb#32610  (sql: can't insert self reference)
   4: cockroachdb#40205  (sql: add non-trivial implementations of FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR NO KEY SHARE)
   4: cockroachdb#35897  (sql: unknown function: pg_terminate_backend())
   4: cockroachdb#4035   (sql/pgwire: missing support for row count limits in pgwire)
   3: cockroachdb#27796  (sql: support user-defined DOMAIN types)
   3: cockroachdb#3781   (sql: Add Data Type Formatting Functions)
   3: cockroachdb#40476  (sql: support `FOR {UPDATE,SHARE} {SKIP LOCKED,NOWAIT}`)
   3: cockroachdb#35882  (sql: support other character sets)
   2: cockroachdb#10028  (sql: Support view queries with star expansions)
   2: cockroachdb#35807  (sql: INTERVAL output doesn't match PG)
   2: cockroachdb#35902  (sql: large object support)
   2: cockroachdb#40474  (sql: support `SELECT ... FOR UPDATE OF` syntax)
   1: cockroachdb#18846  (sql: Support CIDR column type)
   1: cockroachdb#9682   (sql: implement computed indexes)
   1: cockroachdb#31632  (sql: FK options (deferrable, etc))
   1: cockroachdb#24897  (sql: CREATE OR REPLACE VIEW)
   1: pass?                                                  (unknown)
   1: cockroachdb#36215  (sql: enable setting standard_conforming_strings to off)
   1: cockroachdb#32562  (sql: support SET LOCAL and txn-scoped session variable changes)
   1: cockroachdb#36116  (sql: psychopg: investigate how `'infinity'::timestamp` is presented)
   1: cockroachdb#26732  (sql: support the binary operator: <int> / <float>)
   1: cockroachdb#23299  (sql: support coercing string literals to arrays)
   1: cockroachdb#36115  (sql: psychopg: investigate if datetimetz is being returned instead of datetime)
   1: cockroachdb#26925  (sql: make the CockroachDB integer types more compatible with postgres)
   1: cockroachdb#21085  (sql: WITH RECURSIVE (recursive common table expressions))
   1: cockroachdb#36179  (sql: implicity convert date to timestamp)
   1: cockroachdb#36118  (sql: Cannot parse '24:00' as type time)
   1: cockroachdb#31708  (sql: support current_time)
```

Release justification: non-production change
Release note: None
jordanlewis added a commit to jordanlewis/cockroach that referenced this issue Oct 24, 2019
The spreadsheet we discussed is unwieldy - hard to edit and impossible to keep
up to date. If we write down blacklists in code, then we can use an approach
like this to always have an up to date aggregation.

So far it seems like there's just a lot of unknowns to categorize still.

The output today:

```
=== RUN   TestBlacklists
 648: unknown                                                (unknown)
 493: cockroachdb#5807   (sql: Add support for TEMP tables)
 151: cockroachdb#17511  (sql: support stored procedures)
  86: cockroachdb#26097  (sql: make TIMETZ more pg-compatible)
  56: cockroachdb#10735  (sql: support SQL savepoints)
  55: cockroachdb#32552  (multi-dim arrays)
  55: cockroachdb#26508  (sql: restricted DDL / DML inside transactions)
  52: cockroachdb#32565  (sql: support optional TIME precision)
  39: cockroachdb#243    (roadmap: Blob storage)
  33: cockroachdb#26725  (sql: support postgres' API to handle blob storage (incl lo_creat, lo_from_bytea))
  31: cockroachdb#27793  (sql: support custom/user-defined base scalar (primitive) types)
  24: cockroachdb#12123  (sql: Can't drop and replace a table within a transaction)
  24: cockroachdb#26443  (sql: support user-defined schemas between database and table)
  20: cockroachdb#21286  (sql: Add support for geometric types)
  18: cockroachdb#6583   (sql: explicit lock syntax (SELECT FOR {SHARE,UPDATE} {skip locked,nowait}))
  17: cockroachdb#22329  (Support XA distributed transactions in CockroachDB)
  16: cockroachdb#24062  (sql: 32 bit SERIAL type)
  16: cockroachdb#30352  (roadmap:when CockroachDB  will support cursor?)
  12: cockroachdb#27791  (sql: support RANGE types)
   8: cockroachdb#40195  (pgwire: multiple active result sets (portals) not supported)
   8: cockroachdb#6130   (sql: add support for key watches with notifications of changes)
   5: Expected Failure                                       (unknown)
   5: cockroachdb#23468  (sql: support sql arrays of JSONB)
   5: cockroachdb#40854  (sql: set application_name from connection string)
   4: cockroachdb#35879  (sql: `default_transaction_read_only` should also accept 'on' and 'off')
   4: cockroachdb#32610  (sql: can't insert self reference)
   4: cockroachdb#40205  (sql: add non-trivial implementations of FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR NO KEY SHARE)
   4: cockroachdb#35897  (sql: unknown function: pg_terminate_backend())
   4: cockroachdb#4035   (sql/pgwire: missing support for row count limits in pgwire)
   3: cockroachdb#27796  (sql: support user-defined DOMAIN types)
   3: cockroachdb#3781   (sql: Add Data Type Formatting Functions)
   3: cockroachdb#40476  (sql: support `FOR {UPDATE,SHARE} {SKIP LOCKED,NOWAIT}`)
   3: cockroachdb#35882  (sql: support other character sets)
   2: cockroachdb#10028  (sql: Support view queries with star expansions)
   2: cockroachdb#35807  (sql: INTERVAL output doesn't match PG)
   2: cockroachdb#35902  (sql: large object support)
   2: cockroachdb#40474  (sql: support `SELECT ... FOR UPDATE OF` syntax)
   1: cockroachdb#18846  (sql: Support CIDR column type)
   1: cockroachdb#9682   (sql: implement computed indexes)
   1: cockroachdb#31632  (sql: FK options (deferrable, etc))
   1: cockroachdb#24897  (sql: CREATE OR REPLACE VIEW)
   1: pass?                                                  (unknown)
   1: cockroachdb#36215  (sql: enable setting standard_conforming_strings to off)
   1: cockroachdb#32562  (sql: support SET LOCAL and txn-scoped session variable changes)
   1: cockroachdb#36116  (sql: psychopg: investigate how `'infinity'::timestamp` is presented)
   1: cockroachdb#26732  (sql: support the binary operator: <int> / <float>)
   1: cockroachdb#23299  (sql: support coercing string literals to arrays)
   1: cockroachdb#36115  (sql: psychopg: investigate if datetimetz is being returned instead of datetime)
   1: cockroachdb#26925  (sql: make the CockroachDB integer types more compatible with postgres)
   1: cockroachdb#21085  (sql: WITH RECURSIVE (recursive common table expressions))
   1: cockroachdb#36179  (sql: implicity convert date to timestamp)
   1: cockroachdb#36118  (sql: Cannot parse '24:00' as type time)
   1: cockroachdb#31708  (sql: support current_time)
```

Release justification: non-production change
Release note: None
craig bot pushed a commit that referenced this issue Nov 7, 2019
41252: roachtest: add test that aggregates orm blacklist failures r=jordanlewis a=jordanlewis

The spreadsheet we discussed is unwieldy - hard to edit and impossible to keep
up to date. If we write down blacklists in code, then we can use an approach
like this to always have an up to date aggregation.

So far it seems like there's just a lot of unknowns to categorize still.

The output today:

```
=== RUN   TestBlacklists
 648: unknown                                                (unknown)
 493: #5807   (sql: Add support for TEMP tables)
 151: #17511  (sql: support stored procedures)
  86: #26097  (sql: make TIMETZ more pg-compatible)
  56: #10735  (sql: support SQL savepoints)
  55: #32552  (multi-dim arrays)
  55: #26508  (sql: restricted DDL / DML inside transactions)
  52: #32565  (sql: support optional TIME precision)
  39: #243    (roadmap: Blob storage)
  33: #26725  (sql: support postgres' API to handle blob storage (incl lo_creat, lo_from_bytea))
  31: #27793  (sql: support custom/user-defined base scalar (primitive) types)
  24: #12123  (sql: Can't drop and replace a table within a transaction)
  24: #26443  (sql: support user-defined schemas between database and table)
  20: #21286  (sql: Add support for geometric types)
  18: #6583   (sql: explicit lock syntax (SELECT FOR {SHARE,UPDATE} {skip locked,nowait}))
  17: #22329  (Support XA distributed transactions in CockroachDB)
  16: #24062  (sql: 32 bit SERIAL type)
  16: #30352  (roadmap:when CockroachDB  will support cursor?)
  12: #27791  (sql: support RANGE types)
   8: #40195  (pgwire: multiple active result sets (portals) not supported)
   8: #6130   (sql: add support for key watches with notifications of changes)
   5: Expected Failure                                       (unknown)
   5: #23468  (sql: support sql arrays of JSONB)
   5: #40854  (sql: set application_name from connection string)
   4: #35879  (sql: `default_transaction_read_only` should also accept 'on' and 'off')
   4: #32610  (sql: can't insert self reference)
   4: #40205  (sql: add non-trivial implementations of FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR NO KEY SHARE)
   4: #35897  (sql: unknown function: pg_terminate_backend())
   4: #4035   (sql/pgwire: missing support for row count limits in pgwire)
   3: #27796  (sql: support user-defined DOMAIN types)
   3: #3781   (sql: Add Data Type Formatting Functions)
   3: #40476  (sql: support `FOR {UPDATE,SHARE} {SKIP LOCKED,NOWAIT}`)
   3: #35882  (sql: support other character sets)
   2: #10028  (sql: Support view queries with star expansions)
   2: #35807  (sql: INTERVAL output doesn't match PG)
   2: #35902  (sql: large object support)
   2: #40474  (sql: support `SELECT ... FOR UPDATE OF` syntax)
   1: #18846  (sql: Support CIDR column type)
   1: #9682   (sql: implement computed indexes)
   1: #31632  (sql: FK options (deferrable, etc))
   1: #24897  (sql: CREATE OR REPLACE VIEW)
   1: pass?                                                  (unknown)
   1: #36215  (sql: enable setting standard_conforming_strings to off)
   1: #32562  (sql: support SET LOCAL and txn-scoped session variable changes)
   1: #36116  (sql: psychopg: investigate how `'infinity'::timestamp` is presented)
   1: #26732  (sql: support the binary operator: <int> / <float>)
   1: #23299  (sql: support coercing string literals to arrays)
   1: #36115  (sql: psychopg: investigate if datetimetz is being returned instead of datetime)
   1: #26925  (sql: make the CockroachDB integer types more compatible with postgres)
   1: #21085  (sql: WITH RECURSIVE (recursive common table expressions))
   1: #36179  (sql: implicity convert date to timestamp)
   1: #36118  (sql: Cannot parse '24:00' as type time)
   1: #31708  (sql: support current_time)
```

Release justification: non-production change
Release note: None

Co-authored-by: Jordan Lewis <jordanthelewis@gmail.com>
@nvanbenschoten nvanbenschoten self-assigned this Jan 13, 2020
@nvanbenschoten
Copy link
Member

nvanbenschoten commented Jan 13, 2020

The issue lists four lock types, which are documented in PG's docs here. Here are a few takeways from thinking about how these will play into CRDB in the future:

  1. CRDB does not distinguish between locking on a KV's key and locking its key and value. Because of that, I think it makes sense to upgrade all FOR NO KEY [SHARE/UPDATE] variants to its corresponding FOR [SHARE/UPDATE] locking mode. This is correct per PG's lock conflict table. The only false conflict it introduces is between FOR NO KEY UPDATE and FOR KEY SHARE. I don't think this is worth worrying about, especially because of the next point.
  2. We can leave the FOR SHARE locking mode as a no-op, as CRDB's default concurrency control scheme is already similar to this. We wouldn't want to upgrade these locking modes to FOR UPGRADE and introduce synthetic conflicts between multiple FOR SHARE locking modes. This might cause deadlocks in user applications.
  3. If/when we eventually introduce read-locks at the KV level, the FOR SHARE locking mode should line up with their semantics.

In addition to supporting these locking modes, we'll also want to use the same approach that Postgres uses for using these locking modes in UPDATE and DELETE statements. From Postgres' docs:

The FOR UPDATE lock mode is also acquired by any DELETE on a row, and also by an UPDATE that modifies the values on certain columns. Currently, the set of columns considered for the UPDATE case are those that have a unique index on them that can be used in a foreign key (so partial indexes and expressional indexes are not considered), but this may change in the future.

CockroachDB implements serializability, so this is not needed for correctness, but acquiring these locks implicitly in UPDATE and DELETE statements has the potential to:

  1. reduce transaction retries
  2. improve performance by reducing thrashing and enforcing better queueing

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jan 15, 2020
Relates to cockroachdb#40205.

This change updates optbuilder to propagate row-level locking modes through the
*scope object to achieve the same locking clause scoping that Postgres contains.
The change propagates the locking mode all the way down to the ScanExpr level,
where a single locking mode becomes an optional property of a Scan.

With proper scoping rules, the change also improves upon the locking validation
introduced in cockroachdb#43887.

This completes the SELECT FOR [KEY] SHARE/UPDATE optbuilder interactions. The
next step will be to hook the ScanExpr locking property into the execbuilder.

Release note (sql change): More invalid usages of FOR UPDATE locking clauses
are now rejected by the SQL optimizer.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jan 16, 2020
Relates to cockroachdb#40205.

This change updates optbuilder to propagate row-level locking modes through the
*scope object to achieve the same locking clause scoping that Postgres contains.
The change propagates the locking mode all the way down to the ScanExpr level,
where a single locking mode becomes an optional property of a Scan.

With proper scoping rules, the change also improves upon the locking validation
introduced in cockroachdb#43887.

This completes the SELECT FOR [KEY] SHARE/UPDATE optbuilder interactions. The
next step will be to hook the ScanExpr locking property into the execbuilder.

Release note (sql change): More invalid usages of FOR UPDATE locking clauses
are now rejected by the SQL optimizer.
@sumeerbhola
Copy link
Collaborator

  1. CRDB does not distinguish between locking on a KV's key and locking its key and value. Because of that, I think it makes sense to upgrade all FOR NO KEY [SHARE/UPDATE] variants to its corresponding FOR [SHARE/UPDATE] locking mode. This is correct per PG's lock conflict table. The only false conflict it introduces is between FOR NO KEY UPDATE and FOR NO KEY SHARE. I don't think this is worth worrying about, especially because of the next point.

nit: I assume you meant FOR KEY SHARE.

  1. We can leave the FOR SHARE locking mode as a no-op, as CRDB's default concurrency control scheme is already similar to this. We wouldn't want to upgrade these locking modes to FOR UPGRADE and introduce synthetic conflicts between multiple FOR SHARE locking modes. This might cause deadlocks in user applications.
  2. If/when we eventually introduce read-locks at the KV level, the FOR SHARE locking mode should line up with their semantics.

I'm not clear about how not "worth worrying about" is not contradicted by bullet 3 -- if upgrading KEY SHARE to SHARE (for a transaction that is reading keys) and NO KEY UPDATE to UPDATE (for a transaction that is updating the non key columns) can introduce deadlocks when FOR SHARE is implemented with locking, isn't that something to be concerned about?

@nvanbenschoten
Copy link
Member

nit: I assume you meant FOR KEY SHARE.

Yes, updated.

isn't that something to be concerned about?

I had the same thought when writing this. There is a bit of a contradiction in this reasoning, as you pointed out. We don't want to weaken these locking modes any further because then they become no-ops and not useful to expose. However, we don't want to strengthen them because then that presents the possibility of creating unexpected deadlocks for unsuspecting users. We also likely don't want to actually support them properly as that would be a large undertaking.

The way I've been reconciling this is that in my mind there are two primary row-level locking classes (SHARE and UPDATE) and each class contains two locking modes differentiated by the KEY vs. NO KEY distinction. This secondary distinction feels like a Postgres implementation detail. For instance, there is no such secondary distinction in MySQL: https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html. It also doesn't map to CockroachDB today.

So I think we have two choices here:

  1. reject the KEY/NO KEY specifier and only accept the FOR SHARE and FOR UPDATE. This isn't a terrible option as I highly doubt people use KEY/NO KEY frequently in practice. However, it does prevent people who currently do use these specifiers from running on CRDB without modification to their code.
  2. ignore uses of the KEY/NO KEY specifier and document that it is ignored and that only the primary row-level locking classes are considered. This does create the opportunity for deadlock if a user has built their application in such a way that they are using these clauses for correctness and not just performance. However, it does not force people who are using the specifiers for performance only from needing to modify their code.

I think a strong argument could be made for both approached. @rafiss I'm curious if you have an opinion here.

Also, to add a little more context here, KEY/NO KEY were added to PG in this commit: postgres/postgres@0ac5ad5 and the motivation was primarily so that they could be used internally by PG to improve concurrency between UPDATEs that did not modify the key-portion of a row and FK checks. Interestingly, our attempt to solve the same problem is through the use of column families. To this day, PG does still occasionally ignore the KEY/NO KEY specifiers and collapse locking modes along locking classes. This is most common when interfacing with foreign data wrappers: https://github.com/postgres/postgres/blob/1281a5c907b41e992a66deb13c3aa61888a62268/contrib/postgres_fdw/deparse.c#L1274.

@rafiss
Copy link
Collaborator

rafiss commented Jan 16, 2020

I did a very unscientific Github search to see how often the various syntaxes appear in public codebases:

SELECT FROM "FOR UPDATE" - 523,512
SELECT FROM "FOR NO KEY UPDATE" - 4,602
SELECT FROM "FOR SHARE" - 112,272
SELECT FROM "FOR KEY SHARE" - 8,796

That is useful data that suggests that many people probably would not be affected by either choice. But apart from that, another consideration is to figure out if any ORMs generate queries with the KEY/NO KEY specifiers. I would guess none do it by default, and those that do support it probably see similar proportions of usage as above. For what it's worth, looking at some of the ORMs we are focusing on right now: Hibernate does not support it but has an open issue for it, and Django also does not support it but has an open issue/PR. For both of these we are creating a CockroachDB dialect/adapter, so if these issues ever do get implemented we should still be able to avoid using the KEY/NO KEY specifiers with CockroachDB, if necessary.

So I think my vote is for option 1 -- rejecting the KEY/NO KEY specifiers, and also adding telemetry on how often the unimplemented syntax is used. I worry that ignoring them (even with documentation elsewhere) will not be obvious to users, and I would claim that deadlocks happening randomly at runtime are worse than syntax errors happening deterministically at runtime. Though I'm definitely open to being convinced otherwise, and my position isn't held very strongly.

I do want other stakeholders to weigh in. My understanding is that we usually only go with the "accept the syntax but ignore it" approach if the syntax in question has no meaning in CockroachDB and causes no behavior change... which arguably does apply in this case.

@nvanbenschoten
Copy link
Member

Thanks for gathering those numbers and finding references to this in Hibernate and Django Rafi!

I also have conflicted opinions here, but I'm actually leaning towards option 2 for the following reasons:

  • after looking at some of the uses in Github, it looks like this will come up almost exclusively in ORM test suites. Rejecting these modes will prevent some tests from succeeding when they otherwise could.
  • The Github search also confirmed that there is precedence for treating these two strengths the same. See: "This is still correct but less efficient (will conflict more often than required)."
  • the two ORM discussions you posted confirm that people look at these primarily as performance optimizations.
  • I suspect that any developer who would actually depend on the subtleties of these locking modes would be more comfortable reading our documentation on it than developers who observe them being rejected and don't understand the difference between the modes.
  • It's worth keeping in mind that we do have transaction deadlock detection, so this wouldn't cause an application to deadlock indefinitely. Instead, applications would get transaction aborted errors if their use of these locking modes does cause deadlocks.

Also, keep in mind that for the next release we won't even perform any FOR SHARE locking at all, so we don't need to resolve this immediately.

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jan 18, 2020
Relates to cockroachdb#40205.

This change updates optbuilder to propagate row-level locking modes on the stack
to achieve the same locking clause scoping that Postgres contains. The change
propagates the locking mode all the way down to the ScanExpr level, where a
single locking mode becomes an optional property of a Scan.

With proper scoping rules, the change also improves upon the locking validation
introduced in cockroachdb#43887.

This completes the SELECT FOR [KEY] SHARE/UPDATE optbuilder interactions. The
next step will be to hook the ScanExpr locking property into the execbuilder.

Release note (sql change): More invalid usages of FOR UPDATE locking clauses
are now rejected by the SQL optimizer.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jan 18, 2020
Relates to cockroachdb#40205.

This change updates optbuilder to propagate row-level locking modes on the stack
to achieve the same locking clause scoping that Postgres contains. The change
propagates the locking mode all the way down to the ScanExpr level, where a
single locking mode becomes an optional property of a Scan.

With proper scoping rules, the change also improves upon the locking validation
introduced in cockroachdb#43887.

This completes the SELECT FOR [KEY] SHARE/UPDATE optbuilder interactions. The
next step will be to hook the ScanExpr locking property into the execbuilder.

Release note (sql change): More invalid usages of FOR UPDATE locking clauses
are now rejected by the SQL optimizer.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jan 20, 2020
Relates to cockroachdb#40205.

This change updates optbuilder to propagate row-level locking modes on the stack
to achieve the same locking clause scoping that Postgres contains. The change
propagates the locking mode all the way down to the ScanExpr level, where a
single locking mode becomes an optional property of a Scan.

With proper scoping rules, the change also improves upon the locking validation
introduced in cockroachdb#43887.

This completes the SELECT FOR [KEY] SHARE/UPDATE optbuilder interactions. The
next step will be to hook the ScanExpr locking property into the execbuilder.

Release note (sql change): More invalid usages of FOR UPDATE locking clauses
are now rejected by the SQL optimizer.
craig bot pushed a commit that referenced this issue Jan 20, 2020
44015: sql/opt/optbuilder: propagate FOR UPDATE locking in scope to ScanExprs r=nvanbenschoten a=nvanbenschoten

Relates to #40205.

This change updates optbuilder to propagate row-level locking modes through the `*scope` object to achieve the same locking clause scoping that Postgres contains. The change propagates the locking mode all the way down to the `ScanExpr` level, where a single locking mode becomes an optional property of a Scan. 

With proper scoping rules, the change also improves upon the locking validation introduced in #43887.

This completes the SELECT FOR [KEY] SHARE/UPDATE optbuilder interactions. The next step will be to hook the ScanExpr locking property into the execbuilder.

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jan 30, 2020
Relates to cockroachdb#40205.

This change introduces a new option to `MVCCGetOptions` and `MVCCScanOptions`
that causes reads to throw an error if they observe an MVCC version at a
timestamp above their read timestamp. Specifically, when the option is enabled,
a `WriteTooOldError` will be returned if a scan observes an MVCC version with a
timestamp above the read. Similarly, a `WriteIntentError` will be returned if a
scan observes another transaction's intent, even if that intent has a timestamp
above the scan's read timestamp.

This option will be used in the future by `ScanRequests` and `ReverseScanRequests`
that intend to acquire unreplicated key-level locks, which will power `SELECT FOR
UPDATE`. These scans do not want to ignore existing MVCC versions at higher timestamps
like traditional scans do. Instead, they want to throw errors and force their
transaction to increase its timestamp through either a refresh or a retry. This was
previously prototyped in 4a8e8dc.

Interestingly, this is not new logic to the MVCC layer. This behavior is exactly
the same as that of the initial key-value lookup performed during MVCC writes
(see `mvccPutInternal`). It's fitting that behavior needed for `SELECT FOR UPDATE`
would mirror that already exhibited by the read portion of read-write operations.
This also hints at an opportunity to potentially use this option to merge the two
implementations and get rid of custom logic like `mvccGetInternal` that only exists
on the write path. We'd need to be careful about doing so though, as this code is
heavily tuned.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jan 31, 2020
Relates to cockroachdb#40205.

This change introduces a new option to `MVCCGetOptions` and `MVCCScanOptions`
that causes reads to throw an error if they observe an MVCC version at a
timestamp above their read timestamp. Specifically, when the option is enabled,
a `WriteTooOldError` will be returned if a scan observes an MVCC version with a
timestamp above the read. Similarly, a `WriteIntentError` will be returned if a
scan observes another transaction's intent, even if that intent has a timestamp
above the scan's read timestamp.

This option will be used in the future by `ScanRequests` and `ReverseScanRequests`
that intend to acquire unreplicated key-level locks, which will power `SELECT FOR
UPDATE`. These scans do not want to ignore existing MVCC versions at higher timestamps
like traditional scans do. Instead, they want to throw errors and force their
transaction to increase its timestamp through either a refresh or a retry. This was
previously prototyped in 4a8e8dc.

Interestingly, this is not new logic to the MVCC layer. This behavior is exactly
the same as that of the initial key-value lookup performed during MVCC writes
(see `mvccPutInternal`). It's fitting that behavior needed for `SELECT FOR UPDATE`
would mirror that already exhibited by the read portion of read-write operations.
This also hints at an opportunity to potentially use this option to merge the two
implementations and get rid of custom logic like `mvccGetInternal` that only exists
on the write path. We'd need to be careful about doing so though, as this code is
heavily tuned.

Release note: None
craig bot pushed a commit that referenced this issue Feb 1, 2020
44473: storage/engine: expose FailOnMoreRecent MVCC{Get/Scan}Option r=nvanbenschoten a=nvanbenschoten

Relates to #40205.

This change introduces a new option to `MVCCGetOptions` and `MVCCScanOptions` that causes reads to throw an error if they observe an MVCC version at a timestamp above their read timestamp. Specifically, when the option is enabled, a `WriteTooOldError` will be returned if a scan observes an MVCC version with a timestamp above the read. Similarly, a `WriteIntentError` will be returned if a scan observes another transaction's intent, even if that intent has a timestamp above the scan's read timestamp.

This option will be used in the future by `ScanRequests` and `ReverseScanRequests` that intend to acquire unreplicated key-level locks, which will power `SELECT FOR UPDATE`. These scans do not want to ignore existing MVCC versions at higher timestamps like traditional scans do. Instead, they want to throw errors and force their transaction to increase its timestamp through either a refresh or a retry. This was previously prototyped in 4a8e8dc.

Interestingly, this is not new logic to the MVCC layer. This behavior is exactly the same as that of the initial key-value lookup performed during MVCC writes (see `mvccPutInternal`). It's fitting that behavior needed for `SELECT FOR UPDATE` would mirror that already exhibited by the read portion of read-write operations. This also hints at an opportunity to potentially use this option to merge the two implementations and get rid of custom logic like `mvccGetInternal` that only exists on the write path. We'd need to be careful about doing so though, as this code is heavily tuned.

Release note: None

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Feb 6, 2020
Relates to cockroachdb#40205.

This change adds FOR UPDATE locking mode information to the EXPLAIN
output of scanNodes. This is useful for testing that locking modes are
being correctly propagated when a FOR [KEY] UPDATE/SHARE locking clause
is specified. It will be even more useful for testing that locking clauses
are being implicitly applied to UPDATE, UPSERT, and DELETE statements when
we start performing that implicit transformation to their underlying scans.
craig bot pushed a commit that referenced this issue Feb 6, 2020
44790: sql: add FOR UPDATE locking modes to EXPLAIN output r=nvanbenschoten a=nvanbenschoten

Relates to #40205.

This change adds FOR UPDATE locking mode information to the EXPLAIN
output of scanNodes. This is useful for testing that locking modes are
being correctly propagated when a FOR [KEY] UPDATE/SHARE locking clause
is specified. It will be even more useful for testing that locking clauses
are being implicitly applied to UPDATE, UPSERT, and DELETE statements when
we start performing that implicit transformation to their underlying scans.

We currently reject both non-standard locking wait-policies, so we can't easily
test that code. I'm open to suggestions to address that, but I also don't think it's
a very big issue at the moment and I don't think it makes sense to hold off on making
the change while we're already here. If/when we let non-standard locking wait policies
through the optimizer, we'll be able to add similar tests for them.

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Feb 27, 2020
Related to cockroachdb#41720.
Related to cockroachdb#44976.

This commit integrates the new concurrency package into the storage
package. Each Replica is given a concurrency manager, which replaces
its existing latch manager and txn wait queue. The change also uses
the concurrency manager to simplify the role of the intent resolver.
The intent resolver no longer directly handles WriteIntentErrors. As
a result, we are able to delete the contention queue entirely.

With this change, all requests are now sequenced through the concurrency
manager. When sequencing, latches are acquired and conflicting locks are
detected. If any locks are found, the requests wait in lock wait-queues
for the locks to be resolved. This is a major deviation from how things
currently work because today, even with the contention queue, requests
end up waiting for conflicting transactions to commit/abort in the
txnWaitQueue after at least one RPC. Now, requests wait directly next
to the intents/locks that they are waiting on and react directly to the
resolution of these intents/locks.

Once requests are sequenced by the concurrency manager, they are
theoretically fully isolated from all conflicting requests. However,
this is not strictly true today because we have not yet pulled all
replicated locks into the concurrency manager's lock table. We will
do so in a future change. Until then, the concurrency manager maintains
a notion of "intent discovery", which is integrated into the Replica-level
concurrency retry loop.

Performance numbers will be published shortly. This will be followed
by performance numbers using the SELECT FOR UPDATE locking (cockroachdb#40205)
improvements that this change enables.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Feb 28, 2020
Related to cockroachdb#41720.
Related to cockroachdb#44976.

This commit integrates the new concurrency package into the storage
package. Each Replica is given a concurrency manager, which replaces
its existing latch manager and txn wait queue. The change also uses
the concurrency manager to simplify the role of the intent resolver.
The intent resolver no longer directly handles WriteIntentErrors. As
a result, we are able to delete the contention queue entirely.

With this change, all requests are now sequenced through the concurrency
manager. When sequencing, latches are acquired and conflicting locks are
detected. If any locks are found, the requests wait in lock wait-queues
for the locks to be resolved. This is a major deviation from how things
currently work because today, even with the contention queue, requests
end up waiting for conflicting transactions to commit/abort in the
txnWaitQueue after at least one RPC. Now, requests wait directly next
to the intents/locks that they are waiting on and react directly to the
resolution of these intents/locks.

Once requests are sequenced by the concurrency manager, they are
theoretically fully isolated from all conflicting requests. However,
this is not strictly true today because we have not yet pulled all
replicated locks into the concurrency manager's lock table. We will
do so in a future change. Until then, the concurrency manager maintains
a notion of "intent discovery", which is integrated into the Replica-level
concurrency retry loop.

Performance numbers will be published shortly. This will be followed
by performance numbers using the SELECT FOR UPDATE locking (cockroachdb#40205)
improvements that this change enables.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Mar 2, 2020
Related to cockroachdb#41720.
Related to cockroachdb#44976.

This commit integrates the new concurrency package into the storage
package. Each Replica is given a concurrency manager, which replaces
its existing latch manager and txn wait queue. The change also uses
the concurrency manager to simplify the role of the intent resolver.
The intent resolver no longer directly handles WriteIntentErrors. As
a result, we are able to delete the contention queue entirely.

With this change, all requests are now sequenced through the concurrency
manager. When sequencing, latches are acquired and conflicting locks are
detected. If any locks are found, the requests wait in lock wait-queues
for the locks to be resolved. This is a major deviation from how things
currently work because today, even with the contention queue, requests
end up waiting for conflicting transactions to commit/abort in the
txnWaitQueue after at least one RPC. Now, requests wait directly next
to the intents/locks that they are waiting on and react directly to the
resolution of these intents/locks.

Once requests are sequenced by the concurrency manager, they are
theoretically fully isolated from all conflicting requests. However,
this is not strictly true today because we have not yet pulled all
replicated locks into the concurrency manager's lock table. We will
do so in a future change. Until then, the concurrency manager maintains
a notion of "intent discovery", which is integrated into the Replica-level
concurrency retry loop.

Performance numbers will be published shortly. This will be followed
by performance numbers using the SELECT FOR UPDATE locking (cockroachdb#40205)
improvements that this change enables.
craig bot pushed a commit that referenced this issue Mar 2, 2020
45269: importccl: Parallelize avro import r=miretskiy a=miretskiy

Parallelize avro importer to improve its throughput (2.8x improvement).

Touches #40374.
Fixes #45097.

Release notes (performance): Faster avro import

45482: storage: integrate Concurrency Manager into Replica request path r=nvanbenschoten a=nvanbenschoten

Related to #41720.
Related to #44976.

This commit integrates the new concurrency package into the storage package. Each Replica is given a concurrency manager, which replaces its existing latch manager and txn wait queue. The change also uses the concurrency manager to simplify the role of the intent resolver. The intent resolver no longer directly handles WriteIntentErrors. As a result, we are able to delete the contention queue entirely.

With this change, all requests are now sequenced through the concurrency manager. When sequencing, latches are acquired and conflicting locks are detected. If any locks are found, the requests wait in lock wait-queues for the locks to be resolved. This is a major deviation from how things currently work because today, even with the contention queue, requests end up waiting for conflicting transactions to commit/abort in the txnWaitQueue after at least one RPC. Now, requests wait directly next to the intents/locks that they are waiting on and react directly to the resolution of these intents/locks.

Once requests are sequenced by the concurrency manager, they are theoretically fully isolated from all conflicting requests. However, this is not strictly true today because we have not yet pulled all replicated locks into the concurrency manager's lock table. We will do so in a future change. Until then, the concurrency manager maintains a notion of "intent discovery", which is integrated into the Replica-level concurrency retry loop.

Performance numbers will be published shortly. This will be followed by performance numbers using the SELECT FOR UPDATE locking (#40205) improvements that this change enables.

45484: sql: simplify connection state machine - stop tracking retry intent r=andreimatei a=andreimatei

Before this patch, the SQL connection state machine had an optimization:
if a transaction that hadn't used "SAVEPOINT cockroach_restart"
encountered a retriable error that we can't auto-retry, then we'd
release the txn's locks eagerly and enter the Aborted state. As opposed
to transactions that had used the "SAVEPOINT cockroach_restart", which
go to RestartWait.
This optimization is a significant complication for the state machine,
so this patch is removing it. All transactions now go to RestartWait,
and wait for a ROLLBACK to release the locks.

On the flip side, doing "RELEASE SAVEPOINT cockroach_restart" and
"ROLLBACK SAVEPOINT cockroach_restart" now works even for transactions
that haven't explicitly declared that savepoint, which is nice. Although
I don't promise I'll keep it working.

Release note: None

Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Mar 4, 2020
Closes cockroachdb#40205.
Informs cockroachdb#41720.

This change teaches the KV client and the KV API about unreplicated locks.
It then adds a KeyLocking mode to ScanRequest and ReverseScanRequest, which
allows their users to select the locking strength that they would like the
scan to use. This locking strength defaults to None, which corresponds to
the current behavior. However, some users will want to acquire locks on each
row scanned, which is now possible by setting the locking strength to a
stronger level. For now, only the Exclusive strength is supported.

The change then revisits SQL's row-level locking support, which is supported
all the way down to the row fetcher for implicit (e.g. UPDATE) and explicit
(e.g. SELECT ... FOR UPDATE) upgrade locking. The change uses the new
key-locking functionality in the KV API to hook up row-level locking,
completing the integration of SELECT FOR UPDATE with the KV layer and,
in particular, the new lock-table structure.

cockroachdb#43775 described the three main
benefits of this change:
- higher throughput under contention
- lower latency and improved fairness under contention
- a reduction in transaction retries under contention

I've revisited those results a few times in the last two months and seen that
the results continue to hold, and in some cases they have improved. I intend
to update this PR with a more complete analysis of its impact on those three
areas.

Release note (sql change): SELECT FOR UPDATE now hooks into a new
leaseholder-only locking mechanism. This allows the feature to be used
to improve performance of transactional that read, modify, and write
contended to rows. Similarly, UPDATE statements now use this new
mechanism by default, meaning that their performance under contention is
improved.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Mar 4, 2020
Closes cockroachdb#40205.
Informs cockroachdb#41720.

This change teaches the KV client and the KV API about unreplicated locks.
It then adds a KeyLocking mode to ScanRequest and ReverseScanRequest, which
allows their users to select the locking strength that they would like the
scan to use. This locking strength defaults to None, which corresponds to
the current behavior. However, some users will want to acquire locks on each
row scanned, which is now possible by setting the locking strength to a
stronger level. For now, only the Exclusive strength is supported.

The change then revisits SQL's row-level locking support, which is supported
all the way down to the row fetcher for implicit (e.g. UPDATE) and explicit
(e.g. SELECT ... FOR UPDATE) upgrade locking. The change uses the new
key-locking functionality in the KV API to hook up row-level locking,
completing the integration of SELECT FOR UPDATE with the KV layer and,
in particular, the new lock-table structure.

cockroachdb#43775 described the three main
benefits of this change:
- higher throughput under contention
- lower latency and improved fairness under contention
- a reduction in transaction retries under contention

I've revisited those results a few times in the last two months and seen that
the results continue to hold, and in some cases they have improved. I intend
to update this PR with a more complete analysis of its impact on those three
areas.

Release note (sql change): SELECT FOR UPDATE now hooks into a new
leaseholder-only locking mechanism. This allows the feature to be used
to improve performance of transactional that read, modify, and write
contended to rows. Similarly, UPDATE statements now use this new
mechanism by default, meaning that their performance under contention is
improved.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Mar 6, 2020
Closes cockroachdb#40205.
Informs cockroachdb#41720.

This change teaches the KV client and the KV API about unreplicated locks.
It then adds a KeyLocking mode to ScanRequest and ReverseScanRequest, which
allows their users to select the locking strength that they would like the
scan to use. This locking strength defaults to None, which corresponds to
the current behavior. However, some users will want to acquire locks on each
row scanned, which is now possible by setting the locking strength to a
stronger level. For now, only the Exclusive strength is supported.

The change then revisits SQL's row-level locking support, which is supported
all the way down to the row fetcher for implicit (e.g. UPDATE) and explicit
(e.g. SELECT ... FOR UPDATE) upgrade locking. The change uses the new
key-locking functionality in the KV API to hook up row-level locking,
completing the integration of SELECT FOR UPDATE with the KV layer and,
in particular, the new lock-table structure.

cockroachdb#43775 described the three main
benefits of this change:
- higher throughput under contention
- lower latency and improved fairness under contention
- a reduction in transaction retries under contention

I've revisited those results a few times in the last two months and seen that
the results continue to hold, and in some cases they have improved. I intend
to update this PR with a more complete analysis of its impact on those three
areas.

Release note (sql change): SELECT FOR UPDATE now hooks into a new
leaseholder-only locking mechanism. This allows the feature to be used
to improve performance of transactional that read, modify, and write
contended to rows. Similarly, UPDATE statements now use this new
mechanism by default, meaning that their performance under contention is
improved. This is only enabled for UPDATE statements that can push their
filter all the way into their key-value scan. To determine whether an
UPDATE statement is implicitly using SELECT FOR UPDATE locking, look
for a "locking strength" field in the EXPLAIN output for the statement.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Mar 6, 2020
Closes cockroachdb#40205.
Informs cockroachdb#41720.

This change teaches the KV client and the KV API about unreplicated locks.
It then adds a KeyLocking mode to ScanRequest and ReverseScanRequest, which
allows their users to select the locking strength that they would like the
scan to use. This locking strength defaults to None, which corresponds to
the current behavior. However, some users will want to acquire locks on each
row scanned, which is now possible by setting the locking strength to a
stronger level. For now, only the Exclusive strength is supported.

The change then revisits SQL's row-level locking support, which is supported
all the way down to the row fetcher for implicit (e.g. UPDATE) and explicit
(e.g. SELECT ... FOR UPDATE) upgrade locking. The change uses the new
key-locking functionality in the KV API to hook up row-level locking,
completing the integration of SELECT FOR UPDATE with the KV layer and,
in particular, the new lock-table structure.

cockroachdb#43775 described the three main
benefits of this change:
- higher throughput under contention
- lower latency and improved fairness under contention
- a reduction in transaction retries under contention

I've revisited those results a few times in the last two months and seen that
the results continue to hold, and in some cases they have improved. I intend
to update this PR with a more complete analysis of its impact on those three
areas.

Release note (sql change): SELECT FOR UPDATE now hooks into a new
leaseholder-only locking mechanism. This allows the feature to be used
to improve performance of transactions that read, modify, and write
contended to rows. Similarly, UPDATE statements now use this new
mechanism by default, meaning that their performance under contention is
improved. This is only enabled for UPDATE statements that can push their
filter all the way into their key-value scan. To determine whether an
UPDATE statement is implicitly using SELECT FOR UPDATE locking, look
for a "locking strength" field in the EXPLAIN output for the statement.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) X-anchored-telemetry The issue number is anchored by telemetry references.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants