[YSQL] Support READ COMMITTED isolation level semantics for DMLs #9468
This was referenced Jul 26, 2021
pkj415
added a commit
to pkj415/yugabyte-db
that referenced
this issue
Nov 11, 2021
Summary: Support the initial part of the READ COMMITTED isolation level by using a new ConsistentReadPoint for every query in a read committed txn. The read point will be set to the current hybrid time on the txn manager (i.e., postgres).
Test Plan:
./yb_build.sh --java-test org.yb.pgsql.TestPgTransactions#testReadPointInReadCommittedIsolation
./yb_build.sh --java-test org.yb.pgsql.TestPgTransparentRestarts
Reviewers: dmitry
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D13794
pkj415
changed the title
[YSQL] Support READ COMMITTED isolation level with optimistic locking
[YSQL] Support READ COMMITTED isolation level
Nov 11, 2021
pkj415
added a commit
that referenced
this issue
Nov 11, 2021
Summary: Support the initial part of the READ COMMITTED isolation level by using a new ConsistentReadPoint for every statement in a read committed txn. The read point will be set to the current hybrid time on the txn manager (i.e., postgres).
The feature is guarded by the tserver gflag yb_enable_read_committed_isolation -
1. A false value implies the existing behaviour of internally mapping "read committed" to "repeatable read".
2. A true value means that we treat "read committed" as a separate isolation level with the correct expected semantics.
NOTE: To ensure we don't reset the read point to the current time when we are in a kReadRestart retry, a new field "recently_restarted_read_point_" is introduced in consistent_read_point.h. (An alternate solution involves two steps - i) not restarting the read point during the restart wrapper and then ii) restarting it in StartTransactionCommand(). This leads to too much code change and complexity. The new field helps get rid of that complexity.)
Test Plan: Jenkins: urgent
./yb_build.sh --java-test org.yb.pgsql.TestPgTransactions#testReadPointInReadCommittedIsolation
./yb_build.sh --java-test org.yb.pgsql.TestPgTransactions#testReadCommittedEnabledEnvVarCaching
./yb_build.sh --java-test org.yb.pgsql.TestPgTransparentRestarts
Reviewers: kgupta, smishra, dsrinivasan, alex, mihnea
Reviewed By: mihnea
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D13794
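The difference between the two isolation levels described in this summary can be sketched as a small model. This is only an illustration, not the actual ConsistentReadPoint code: `HybridClock`, `Txn`, `begin_statement` and the `read_committed_enabled` field (standing in for the gflag yb_enable_read_committed_isolation) are hypothetical names, and the kReadRestart handling from the NOTE is left out.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HybridClock:
    """Hypothetical stand-in for the hybrid clock on the txn manager."""
    _ticks: itertools.count = field(default_factory=lambda: itertools.count(1))

    def now(self) -> int:
        # Each call returns a strictly larger "hybrid time".
        return next(self._ticks)


@dataclass
class Txn:
    clock: HybridClock
    isolation: str  # "read committed" or "repeatable read"
    # Models the gflag: when False, "read committed" is mapped to "repeatable read".
    read_committed_enabled: bool = True
    read_point: Optional[int] = None

    def effective_isolation(self) -> str:
        if self.isolation == "read committed" and not self.read_committed_enabled:
            return "repeatable read"
        return self.isolation

    def begin_statement(self) -> int:
        # READ COMMITTED: pick a fresh read point for every statement.
        # REPEATABLE READ: pick one read point and keep it for the whole txn.
        if self.read_point is None or self.effective_isolation() == "read committed":
            self.read_point = self.clock.now()
        return self.read_point
```

With this model, two consecutive statements in a read committed txn get increasing read points, while a repeatable read txn (or a read committed txn with the flag off) reuses its first read point.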
pkj415
added a commit
that referenced
this issue
Nov 11, 2021
Summary: Support the initial part of the READ COMMITTED isolation level by using a new ConsistentReadPoint for every statement in a read committed txn. The read point will be set to the current hybrid time on the txn manager (i.e., postgres).
The feature is guarded by the tserver gflag yb_enable_read_committed_isolation -
1. A false value implies the existing behaviour of internally mapping "read committed" to "repeatable read".
2. A true value means that we treat "read committed" as a separate isolation level with the correct expected semantics.
NOTE: To ensure we don't reset the read point to the current time when we are in a kReadRestart retry, a new field "recently_restarted_read_point_" is introduced in consistent_read_point.h. (An alternate solution involves two steps - i) not restarting the read point during the restart wrapper and then ii) restarting it in StartTransactionCommand(). This leads to too much code change and complexity. The new field helps get rid of that complexity.)
Original diff: https://phabricator.dev.yugabyte.com/D13794, 0e0dcde
Test Plan: Jenkins: urgent, rebase: 2.11.0
./yb_build.sh --java-test org.yb.pgsql.TestPgTransactions#testReadPointInReadCommittedIsolation
./yb_build.sh --java-test org.yb.pgsql.TestPgTransactions#testReadCommittedEnabledEnvVarCaching
./yb_build.sh --java-test org.yb.pgsql.TestPgTransparentRestarts
Reviewers: dsrinivasan, mihnea
Reviewed By: mihnea
Subscribers: jenkins-bot, yql
Differential Revision: https://phabricator.dev.yugabyte.com/D13880
pkj415
added a commit
to pkj415/yugabyte-db
that referenced
this issue
Dec 2, 2021
pkj415
added a commit
that referenced
this issue
Dec 28, 2021
…tement in READ COMMITTED isolation (Part-2)
Summary: For a REPEATABLE READ isolation transaction, the read point is picked as the current hybrid time, and the transaction's reads are supposed to include all data from other transactions that committed before this transaction was issued. Any transaction that committed before this transaction might have a commit time of at most this txn's start time (as seen on the txn manager) + max clock skew. This upper bound of max clock skew + transaction start time is called global_limit. To be precise, we use the read point (chosen based on the current hybrid time) as the txn start time when computing the upper bound global_limit.
Data written by other txns with a commit time within global_limit can be of two types -
(1) Data that was committed before this transaction was issued but still has a commit time greater than the read point. Since we want to read everything committed before this txn was issued, one way to proceed is to transparently shift the read point of the txn to a read point >= the commit time of such data, re-read at the tablet server, and send the new read point to the query layer to use as the txn read point. But if the txn has already read some data from other tablet servers using the old read point, transparent shifting of the read point is not possible at the tablet server and a kReadRestart error is thrown to the query layer.
(2) Data that was committed after this transaction was issued and can be ignored. But there is no way to differentiate between such data and data of type 1. So the tablet server takes a conservative approach and tries to either transparently shift the read point as explained above or, if that isn't possible, throws a kReadRestart error to the query layer.
There is an optimization that still helps us differentiate between type 1 and type 2 in a special case and avoid shifting the read point or throwing kReadRestart: the committed data has a commit time within global_limit, but the tablet server knows that the commit happened after this txn was issued because the corresponding intents were written to intents db after the current transaction was issued (based on the optimization in c784595). This is checked by comparing the encoded intent write time in committed entries with the local_limit of the txn. local_limit is a per tablet server limit, chosen as the safe time on the first rpc to the tablet server as part of a txn.
So, barring the optimization, if a tablet server sees data ahead of the read point but within global_limit, it will either transparently shift the read point ahead and inform the query layer about it (if no earlier reads have been performed by the query layer at other tablet servers) or throw a kReadRestart error to the query layer. A tablet server might find data with a commit time within global_limit either in regular db or in intents db (if the data was written as part of a recently committed txn whose intents are yet to move to regular db).
Apart from the transparent shifting of the read point, a second level of transparent retries exists in the query layer: if the tablet server throws a kReadRestart error, the query layer either forwards it to the external client or, if no response data has been sent to the client, in turn transparently retries the whole txn by picking a later read point called the "restart read point", which is late enough to include all such commit times of transactions with data of type 1/2. A retry at a "restart read point" isn't possible if any data has been sent to the client as part of the txn, because the retry might change older response data.
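The tablet-server-side decision described above can be summarised in a few lines. This is a minimal sketch under stated assumptions, not the DocDB implementation: `classify` and `ReadOutcome` are hypothetical names, hybrid times are plain integers, and whether the comparisons against global_limit and local_limit are inclusive is an assumption made here for illustration. `RESTART` stands for "shift the read point transparently if possible, otherwise throw kReadRestart".

```python
from enum import Enum


class ReadOutcome(Enum):
    VISIBLE = "visible"   # committed at or before the read point
    IGNORE = "ignore"     # known to have committed after the txn was issued
    RESTART = "restart"   # ambiguous: shift read point or throw kReadRestart


def classify(commit_time: int, intent_write_time: int, read_point: int,
             max_clock_skew: int, local_limit: int) -> ReadOutcome:
    """Classify a committed record seen by a read (illustrative model only)."""
    global_limit = read_point + max_clock_skew
    if commit_time <= read_point:
        return ReadOutcome.VISIBLE
    if commit_time > global_limit:
        # Beyond the clock-skew window: cannot have committed before this txn.
        return ReadOutcome.IGNORE
    # Ambiguous window (read_point, global_limit]: type 1 or type 2 data.
    if intent_write_time > local_limit:
        # Optimization: the intent was written after the first rpc's safe time,
        # so the commit must have happened after this txn was issued.
        return ReadOutcome.IGNORE
    return ReadOutcome.RESTART
```

For example, with a read point of 100, a max clock skew of 50 and a local_limit of 120, a record committed at 130 whose intent was written at 110 is ambiguous and yields `RESTART`, whereas the same record with an intent write time of 125 can be ignored.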
Note that transparent retries, if possible at all, can only be done in the first statement, since completing a statement always results in some data being sent to the external client.
One fact to note: the first read of a key can result in kReadRestart, but further reads of the same key via later rpcs to the same tablet server can result in kReadRestart only due to data with a commit time after the read point but within global_limit such that -
(1) it was committed after the first rpc. If it had committed before the first rpc, a kReadRestart would have been thrown in the first rpc, resulting in a new restart read point >= all commit timestamps seen in the first rpc.
(2) the intent of the key was written as part of the committed txn before the first rpc (to be precise, before the local_limit that is picked as part of the first rpc). This is because other txns' committed data with intents written after the local_limit won't result in kReadRestart, given the optimization above.
In other words, narrower requirements have to be met for a kReadRestart to occur on a read of a key after the first read. So the chances of a kReadRestart due to a read of a given key decrease after the first rpc that reads that key.
In a READ COMMITTED isolation transaction, a new read point is picked for each statement based on the current hybrid time. Each statement is supposed to include all transactions that commit before the statement is issued. Due to this, all of the above discussion now applies at a per statement level. Even the fact above changes to: '"for each statement", the first read of a key can result in kReadRestart, but further reads of the same key....'. This increases the chances of kReadRestart in a single transaction.
This can be resolved by transparently retrying kReadRestart errors for each statement in the query layer in case no data has been sent to the client for that statement. (In READ COMMITTED we are allowed to do this, instead of worrying whether any data has been sent to the client for the txn, because data sent before this statement had an older read point and hence older responses wouldn't change.)
Test Plan: ./yb_build.sh --java-test org.yb.pgsql.TestPgTransparentRestarts
Reviewers: alex
Reviewed By: alex
Subscribers: sergei, mihnea, yql
Differential Revision: https://phabricator.dev.yugabyte.com/D14397
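The per-statement retry rule from this summary could look roughly like the sketch below. It is an illustration only: `ReadRestart`, `run_statement`, the `execute(read_point, send)` callback shape and the `max_attempts` cap are all hypothetical and do not come from the commit; only the rule "retry at the restart read point as long as no data has been sent to the client for this statement" is taken from the text.

```python
class ReadRestart(Exception):
    """Models a kReadRestart error carrying the suggested restart read point."""

    def __init__(self, restart_read_point: int):
        super().__init__(f"read restart required at {restart_read_point}")
        self.restart_read_point = restart_read_point


def run_statement(execute, read_point: int, send_to_client, max_attempts: int = 3) -> int:
    """Run one statement, retrying read restarts while nothing was sent.

    `execute(read_point, send)` runs the statement and calls `send(row)` for
    every row it streams to the client. Returns the read point that succeeded.
    """
    sent = 0

    def send(row):
        nonlocal sent
        sent += 1
        send_to_client(row)

    for _ in range(max_attempts):
        try:
            execute(read_point, send)
            return read_point
        except ReadRestart as err:
            if sent:
                # Data for this statement already reached the client, so a
                # retry at a later read point could contradict it.
                raise
            read_point = err.restart_read_point
    raise RuntimeError("gave up after repeated read restarts")
```

Only rows sent by the current statement are counted: rows returned by earlier statements of the same READ COMMITTED txn were read at older read points and are unaffected by the retry.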
pkj415
added a commit
that referenced
this issue
Jan 10, 2022
pkj415
added a commit
to pkj415/yugabyte-db
that referenced
this issue
Feb 18, 2022
…n READ COMMITTED isolation (Part-3)
Summary: In this third part, we ensure that we don't throw kConflict errors to external ysql clients when using READ COMMITTED isolation level. We do this by -
1. Re-executing a statement when kConflict is seen: this is done by leveraging savepoints. An internal savepoint is created before execution of every statement, and it is rolled back to on facing a kConflict. This helps get rid of any provisional writes that were written by the statement before the conflict and hence are no longer valid. The statement is retried indefinitely until statement timeout, with configurable exponential backoff. This gives the impression that pessimistic locking is also in place. Note that we also lazily rely only on the statement timeout to get rid of deadlocks, without proactively detecting them with a distributed deadlock detection algorithm. That will come in as a separate improvement with pessimistic locking.
2. Using the highest priority for READ COMMITTED txns: this helps ensure that no other txns can abort a txn.
Test Plan: Enabled Postgres's existing eval-plan-qual isolation test with appropriate modifications to disable cases that require features yet to be implemented on YB. Added a bunch of new tests from the functional spec as well:
src/test/isolation/specs/yb_pb_eval-plan-qual.spec
src/test/isolation/specs/yb_read_committed_insert.spec
src/test/isolation/specs/yb_read_committed_test_internal_savepoint.spec
src/test/isolation/specs/yb_read_committed_update_and_explicit_locking.spec
Reviewers: mihnea, alex, rsami
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D15383
pkj415
added a commit
that referenced
this issue
Feb 24, 2022
…OMMITTED isolation (Part-3)
Summary: In this third part, we ensure that we don't throw kConflict errors to external ysql clients when using READ COMMITTED isolation level. We do this by -
(1) Re-executing a statement when kConflict is seen: this is done by leveraging savepoints. An internal savepoint is created before execution of every statement, and it is rolled back to on facing a kConflict. This helps get rid of any provisional writes that were written by the statement before the conflict and hence are no longer valid. The statement is retried indefinitely until statement timeout, with configurable exponential backoff. This gives the impression that pessimistic locking is also in place. Note that we also lazily rely only on the statement timeout to get rid of deadlocks, without proactively detecting them with a distributed deadlock detection algorithm. That will come in as a separate improvement with pessimistic locking.
(2) Using the highest priority for READ COMMITTED txns: this helps ensure that no other txns can abort a READ COMMITTED txn. Even other READ COMMITTED txns can't.
Test Plan: Jenkins: urgent
Enabled Postgres's existing eval-plan-qual isolation test with appropriate modifications to disable cases that require features yet to be implemented on YB. Added a bunch of new tests from the functional spec as well:
src/test/isolation/specs/yb_pb_eval-plan-qual.spec
src/test/isolation/specs/yb_read_committed_insert.spec
src/test/isolation/specs/yb_read_committed_test_internal_savepoint.spec
src/test/isolation/specs/yb_read_committed_update_and_explicit_locking.spec
Reviewers: mihnea, alex, rsami, mtakahara
Reviewed By: rsami, mtakahara
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D15383
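The savepoint-based retry in point (1) can be sketched as follows. This is a toy model rather than the YSQL code: `Conflict`, `StatementTimeout`, `InMemoryTxn`, `run_with_conflict_retries` and the backoff parameter names and default values are all hypothetical. The only behaviour taken from the summary is: create an internal savepoint before the statement, roll back to it on kConflict, and retry with exponential backoff until the statement timeout.

```python
import time


class Conflict(Exception):
    """Models a kConflict error."""


class StatementTimeout(Exception):
    """Raised when retries cannot continue within the statement timeout."""


class InMemoryTxn:
    """Toy transaction: provisional writes are just entries in a list."""

    def __init__(self):
        self.writes = []

    def savepoint(self) -> int:
        return len(self.writes)

    def rollback_to(self, savepoint: int) -> None:
        # Drop every provisional write made after the savepoint.
        del self.writes[savepoint:]


def run_with_conflict_retries(txn, statement, timeout_s, base_backoff_s=0.001,
                              multiplier=2.0, max_backoff_s=0.1,
                              sleep=time.sleep, clock=time.monotonic):
    deadline = clock() + timeout_s
    backoff = base_backoff_s
    while True:
        savepoint = txn.savepoint()  # internal savepoint before each attempt
        try:
            return statement(txn)
        except Conflict:
            txn.rollback_to(savepoint)  # discard this attempt's writes
            if clock() + backoff > deadline:
                raise StatementTimeout("statement timed out while retrying")
            sleep(backoff)
            backoff = min(backoff * multiplier, max_backoff_s)
```

Because the loop only stops on success or timeout, two statements that block each other are resolved by one of them timing out, which matches the summary's remark that deadlocks are handled lazily through the statement timeout.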
pkj415
added a commit
that referenced
this issue
Feb 25, 2022
…errors in READ COMMITTED isolation (Part-3)
Summary: In this third part, we ensure that we don't throw kConflict errors to external ysql clients when using READ COMMITTED isolation level. We do this by -
(1) Re-executing a statement when kConflict is seen: this is done by leveraging savepoints. An internal savepoint is created before execution of every statement, and it is rolled back to on facing a kConflict. This helps get rid of any provisional writes that were written by the statement before the conflict and hence are no longer valid. The statement is retried indefinitely until statement timeout, with configurable exponential backoff. This gives the impression that pessimistic locking is also in place. Note that we also lazily rely only on the statement timeout to get rid of deadlocks, without proactively detecting them with a distributed deadlock detection algorithm. That will come in as a separate improvement with pessimistic locking.
(2) Using the highest priority for READ COMMITTED txns: this helps ensure that no other txns can abort a READ COMMITTED txn. Even other READ COMMITTED txns can't.
Original commit: https://phabricator.dev.yugabyte.com/D15383, TBD
Test Plan: Jenkins: urgent, rebase: 2.12
Enabled Postgres's existing eval-plan-qual isolation test with appropriate modifications to disable cases that require features yet to be implemented on YB. Added a bunch of new tests from the functional spec as well:
src/test/isolation/specs/yb_pb_eval-plan-qual.spec
src/test/isolation/specs/yb_read_committed_insert.spec
src/test/isolation/specs/yb_read_committed_test_internal_savepoint.spec
src/test/isolation/specs/yb_read_committed_update_and_explicit_locking.spec
Reviewers: mihnea, alex, rsami, mtakahara
Reviewed By: mtakahara
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D15571
jayant07-yb
pushed a commit
to jayant07-yb/yugabyte-db
that referenced
this issue
Mar 8, 2022
…n READ COMMITTED isolation (Part-3)
Summary: In this third part, we ensure that we don't throw kConflict errors to external ysql clients when using READ COMMITTED isolation level. We do this by -
(1) Re-executing a statement when kConflict is seen: this is done by leveraging savepoints. An internal savepoint is created before execution of every statement, and it is rolled back to on facing a kConflict. This helps get rid of any provisional writes that were written by the statement before the conflict and hence are no longer valid. The statement is retried indefinitely until statement timeout, with configurable exponential backoff. This gives the impression that pessimistic locking is also in place. Note that we also lazily rely only on the statement timeout to get rid of deadlocks, without proactively detecting them with a distributed deadlock detection algorithm. That will come in as a separate improvement with pessimistic locking.
(2) Using the highest priority for READ COMMITTED txns: this helps ensure that no other txns can abort a READ COMMITTED txn. Even other READ COMMITTED txns can't.
Test Plan: Jenkins: urgent
Enabled Postgres's existing eval-plan-qual isolation test with appropriate modifications to disable cases that require features yet to be implemented on YB. Added a bunch of new tests from the functional spec as well:
src/test/isolation/specs/yb_pb_eval-plan-qual.spec
src/test/isolation/specs/yb_read_committed_insert.spec
src/test/isolation/specs/yb_read_committed_test_internal_savepoint.spec
src/test/isolation/specs/yb_read_committed_update_and_explicit_locking.spec
Reviewers: mihnea, alex, rsami, mtakahara
Reviewed By: rsami, mtakahara
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D15383
pkj415
changed the title
[YSQL] Support READ COMMITTED isolation level
[YSQL] Support READ COMMITTED isolation level semantics for DMLs
Aug 9, 2022
yugabyte-ci
added
kind/bug
This issue is a bug
priority/medium
Medium priority issue
labels
Aug 9, 2022
Jira Link: DB-3139
Functional spec for full feature - https://docs.google.com/document/d/1bayBT9H0acTFJPLAGcaO5B3MKI57Nra_oDxchYKIFTI
Design doc - https://docs.google.com/document/d/1yqnYJDYjotQXBzhe0kwxfB7c-PmAsrDM-Rs6uAVRgOU/edit