[DocDB] Updates with no column references should acquire row-level locks #22994

Open
1 task done
karthik-ramanathan-3006 opened this issue Jun 24, 2024 · 0 comments
Labels
area/docdb YugabyteDB core features kind/bug This issue is a bug priority/medium Medium priority issue

karthik-ramanathan-3006 commented Jun 24, 2024

Jira Link: DB-11915

Description

Optimizations made to UPDATE (and INSERT ON CONFLICT DO UPDATE) queries can result in DocDB requests with ybctid defined but with no columns to update. Such requests are issued for the purpose of transactional correctness.
The current DocDB behavior skips over such requests -- that is, it does not lock rows when the request has no columns to update.

This issue seeks to modify this behavior.

More details TBD.
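
A minimal sketch of the behavior change, using a toy in-memory model rather than DocDB code. `UpdateRequest`, `ToyTablet`, and the `lock_empty_updates` switch are hypothetical names introduced only for illustration; the sketch assumes nothing about how DocDB actually represents requests or locks.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateRequest:
    """Hypothetical stand-in for a DocDB update request."""
    ybctid: bytes                                 # row identifier
    columns: dict = field(default_factory=dict)   # column name -> new value


class ToyTablet:
    """Toy model of a tablet: stored rows plus the set of locked row ids."""

    def __init__(self, lock_empty_updates: bool):
        self.lock_empty_updates = lock_empty_updates
        self.rows = {}
        self.locked = set()

    def apply(self, req: UpdateRequest) -> None:
        if not req.columns and not self.lock_empty_updates:
            # Behavior described in this issue: the request is skipped
            # outright, so no row-level lock is taken.
            return
        # Take the row lock first, then write any columns that were supplied.
        self.locked.add(req.ybctid)
        if req.columns:
            self.rows.setdefault(req.ybctid, {}).update(req.columns)


empty = UpdateRequest(ybctid=b"row-1")

current = ToyTablet(lock_empty_updates=False)
current.apply(empty)    # nothing is locked

proposed = ToyTablet(lock_empty_updates=True)
proposed.apply(empty)   # the row is locked even though nothing is written
```

In the toy model, the only difference between the two tablets is whether an empty column list short-circuits before the lock is taken.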

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@karthik-ramanathan-3006 karthik-ramanathan-3006 added area/docdb YugabyteDB core features status/awaiting-triage Issue awaiting triage labels Jun 24, 2024
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 24, 2024
@sushantrmishra sushantrmishra removed the status/awaiting-triage Issue awaiting triage label Jul 1, 2024
karthik-ramanathan-3006 added a commit that referenced this issue Aug 1, 2024
… checks when relevant columns not modified

Summary:
**Background**
Prior to this revision, an UPDATE statement specifying a list of target columns X in its SET clause **always** performed the necessary work to update each of the target columns in the storage layer, irrespective of whether the values of the columns actually changed. The necessary work could include acquiring locks, updating indexes, checking constraints, firing triggers, etc.

**The Optimization**
This revision introduces an optimization that checks whether the value of a column is actually being modified before sending (flushing) the updated value to the storage layer.
In particular, the columns whose values are compared are those that can cause extra round trips to the storage layer in the form of:
 - Primary Key Updates
 - Secondary Index Updates
 - Foreign Key Constraints
 - Uniqueness Constraints

The matrix of columns that are marked for update and the objects (indexes, constraints) they impact is computed at planning time.
The optimization is particularly useful in conjunction with prepared statements and ORMs, which tend to specify all columns (both modified and unmodified) in the target list.
The decision of whether a column is actually modified is made on a per-tuple basis at execution time.
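
The split between planning time and execution time can be sketched roughly as follows. This is an illustrative model only, not the code in this revision: the function names are hypothetical, entities are keyed by name rather than OID, and `foo_pkey` is assumed to be the name of the primary key on the `foo` table used in the example in this summary.

```python
def build_update_matrix(entities, target_cols):
    """Planning time: map each entity to the target columns it depends on.

    entities: dict of entity name -> set of column names it references.
    target_cols: columns listed in the SET clause.
    """
    targets = set(target_cols)
    return {name: cols & targets
            for name, cols in entities.items() if cols & targets}


def skippable_entities(matrix, old_row, new_row):
    """Execution time: an entity can be skipped for this tuple when none of
    the columns it depends on changed value."""
    modified = {c for c in new_row if old_row[c] != new_row[c]}
    return {name for name, cols in matrix.items() if not (cols & modified)}


# Entities on the `foo` table (names assumed for illustration).
entities = {
    "foo_pkey": {"h"},
    "foo_v1_idx": {"v1"},
    "foo_v2_idx": {"v2"},
}
matrix = build_update_matrix(entities, ["h", "v1", "v3"])

old = {"h": 7, "v1": 7, "v2": 7, "v3": 7}
new = {"h": 7, "v1": 7, "v2": 7, "v3": 8}   # SET h = v1, v1 = v1, v3 = v3 + 1
skip = skippable_entities(matrix, old, new)
```

Here `foo_v2_idx` never enters the matrix because `v2` is not in the target list, and both `foo_pkey` and `foo_v1_idx` end up skippable because only `v3` changes value.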

**Example**
As a concrete example, consider a table with the following schema and data:
```
yugabyte=# CREATE TABLE foo (h INT PRIMARY KEY, v1 INT, v2 INT, v3 INT);
yugabyte=# CREATE INDEX foo_v1_idx ON foo (v1);
yugabyte=# CREATE INDEX foo_v2_idx ON foo (v2);
yugabyte=# INSERT INTO foo (SELECT i, i, i % 10, i % 100 FROM generate_series(1, 10000) AS i);
```

Performing an UPDATE on the first 1000 rows (without the optimization) yields:
```
yugabyte=# SET yb_explain_hide_non_deterministic_fields TO true;
yugabyte=# EXPLAIN (ANALYZE, DIST) UPDATE foo SET h = v1, v1 = v1, v3 = v3 + 1 WHERE v1 <= 1000;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Update on foo  (cost=0.00..105.00 rows=1000 width=88) (actual rows=0 loops=1)
   ->  Seq Scan on foo  (cost=0.00..105.00 rows=1000 width=88) (actual rows=1000 loops=1)
         Remote Filter: (v1 <= 1000)
         Storage Table Read Requests: 1
         Storage Table Rows Scanned: 10000
         Storage Table Write Requests: 2000
         Storage Index Write Requests: 4000
         Storage Flush Requests: 2000
 Storage Read Requests: 1
 Storage Rows Scanned: 10000
 Storage Write Requests: 6000
 Storage Flush Requests: 2001
(12 rows)
```

The values of `h` and `v1` are not modified by the query, yet they result in multiple write requests to both the main table and the secondary indexes.
Since an update to a key column (of a table or an index) is executed as a DELETE followed by an INSERT, this query requires a large number of flushes.
This makes the query very expensive in terms of the amount of work to be done.
With the proposed optimization, the query is executed as follows:
```
yugabyte=# EXPLAIN (ANALYZE, DIST) UPDATE foo SET h = v1, v1 = v1, v3 = v3 + 1 WHERE v1 <= 1000;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Update on foo  (cost=0.00..105.00 rows=1000 width=88) (actual rows=0 loops=1)
   ->  Seq Scan on foo  (cost=0.00..105.00 rows=1000 width=88) (actual rows=1000 loops=1)
         Remote Filter: (v1 <= 1000)
         Storage Table Read Requests: 1
         Storage Rows Scanned: 10000
         Storage Table Write Requests: 1000
 Storage Read Requests: 1
 Storage Rows Scanned: 10000
 Storage Write Requests: 1000
 Storage Flush Requests: 1
(10 rows)
```
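
The write request counts in the two plans above can be reproduced with simple arithmetic. The sketch below assumes that a primary key update rewrites the main table row and every secondary index entry as one DELETE plus one INSERT each, and that otherwise a single write per row suffices; the function and parameter names are hypothetical, and flush requests are not modeled.

```python
def write_requests(rows, num_secondary_indexes, pk_treated_as_changed):
    """Count storage write requests under the DELETE + INSERT model.

    Assumption: when the primary key is treated as changed, the main-table
    row and each secondary index entry are rewritten as a DELETE plus an
    INSERT. Otherwise one write per row updates the non-key column and no
    index is touched.
    """
    if pk_treated_as_changed:
        table = 2 * rows
        index = 2 * rows * num_secondary_indexes
    else:
        table = rows
        index = 0
    return table, index, table + index


without_opt = write_requests(1000, num_secondary_indexes=2,
                             pk_treated_as_changed=True)
with_opt = write_requests(1000, num_secondary_indexes=2,
                          pk_treated_as_changed=False)
print(without_opt)  # (2000, 4000, 6000)
print(with_opt)     # (1000, 0, 1000)
```

The tuples are (table writes, index writes, total writes), matching the `Storage Table Write Requests`, `Storage Index Write Requests`, and `Storage Write Requests` lines in the plans.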

**Flags and Feature Status**
This revision introduces the following GUCs to control the behavior of this optimization:
 - `yb_update_num_cols_to_compare`: the maximum number of columns to be compared (default: 0).
 - `yb_update_max_cols_size_to_compare`: the maximum size of an individual column that can be compared (default: 10240).

This feature is currently turned off because `yb_update_num_cols_to_compare` defaults to 0.
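
One way to picture how the two limits might interact is sketched below. This is a guess at the semantics, not the implementation: it assumes sizes are in bytes, that columns are considered in target-list order, that a column exactly at the size limit is still eligible, and that any column not picked for comparison is conservatively treated as modified. The function name is hypothetical.

```python
def columns_to_compare(col_sizes, num_cols_to_compare=0, max_col_size=10240):
    """Pick the columns whose old and new values would be compared.

    col_sizes: dict of column name -> value size in bytes (assumed unit),
    in target-list order. Columns that are not picked are assumed to be
    treated as modified.
    """
    picked = []
    for name, size in col_sizes.items():
        if len(picked) >= num_cols_to_compare:
            break
        if size <= max_col_size:
            picked.append(name)
    return picked


sizes = {"h": 4, "payload": 50_000, "v1": 4}
disabled = columns_to_compare(sizes)                        # default limit of 0
enabled = columns_to_compare(sizes, num_cols_to_compare=2)  # skips "payload"
```

With the default of 0 nothing is compared, which corresponds to the feature being off.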

**Debuggability**
Turn on postgres debug2 logs via the following command:
```
./bin/yb-ctl restart --ysql_pg_conf_csv='log_min_messages=debug2'
```

This produces the following debug information:
```
-- At planning time
2024-07-31 10:59:07.124 PDT [76120] DEBUG: Update matrix: rows represent OID of entities, columns represent attnum of cols
2024-07-31 10:59:07.124 PDT [76120] DEBUG:  -		10
2024-07-31 10:59:07.124 PDT [76120] DEBUG:  17415	Y

-- At execution time, on a per-tuple basis
2024-07-31 10:59:07.143 PDT [76120] DEBUG:  Index/constraint with oid 17415 requires an update
2024-07-31 10:59:07.143 PDT [76120] DEBUG:  Relation: 17412	Columns that are inspected and modified: 1 (10)
2024-07-31 10:59:07.143 PDT [76120] DEBUG:  No cols in category: Columns that are inspected and unmodified
2024-07-31 10:59:07.143 PDT [76120] DEBUG:  Relation: 17412	Columns that are marked for update: 1 (10) 2 (11)
```

**Future Work**
 1. Introduce auto-flag infrastructure to safely use row-locking. This is in the context of upgrade safety while the cluster is being upgraded.
 2. As a part of the flag infrastructure, ensure that flags/GUC values are immutable during the lifetime of a query.
 3. #22994: PGSQL_UPDATEs with no column references should acquire row locks.
 4. #23348: Add support for partitioned tables with out of order columns.
 5. Support for serializing optimization metadata in plans.
 6. Enhance randgen grammar to support ModifyTable (INSERT/UPDATE/DELETE) queries.
 7. #23350: PG 15 support.

jenkins: urgent
Jira: DB-7701

Test Plan:
Run the associated pg_regress test as follows:
```
# New tests
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'

# Existing tests
./yb_build.sh --java-test 'org.yb.pgsql.TestPgUpdatePrimaryKey'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgUniqueConstraint'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressTrigger#testPgRegressTrigger'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressDml#testPgRegressDml'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressPushdown#testPgRegressPushdown'
```

Tested scenarios include (but are not limited to):
1. Single row and distributed transactions with and without the feature flag turned on.
2. Relations with a primary key and no secondary indexes or triggers (UPDATEs can take the single row path)
3. Relations with combinations of primary key and secondary indexes.
4. Relations with unconditional before-row triggers.
5. UPDATEs in Colocated databases.
6. UPDATEs covering multiple tuples.
7. Hierarchy of relations with foreign keys.
8. Relations with self-referential foreign keys.
9. Relations with overlapping indexes.
10. Relations having columns with uniqueness constraints.
11. Relations having covering indexes.
12. Relations having partial indexes.
13. Relations having index expressions / predicates.
14. Relations with conditional column triggers.
15. Relations having indexes/constraints out of order (i.e., the order of columns in the relation differs from that of the entity).
16. Relations having combination of hash and range indexes.
17. UPDATEs with correlated subqueries.
18. INSERT ON CONFLICT DO UPDATE.
19. UPDATE RETURNING.
20. UPDATEs on temp tables.

Reviewers: mihnea, jason, amartsinchyk

Reviewed By: amartsinchyk

Subscribers: pjain, jason, smishra, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34040
karthik-ramanathan-3006 added a commit that referenced this issue Aug 20, 2024
… sec index updates and fkey checks when relevant columns not modified

Summary:
**Conflict Resolution for PG15 cherrypick**
- src/postgres/src/include/nodes/plannodes.h:
    - Location: struct ModifyTable definition:
        - My master commit adds two new fields (`yb_update_affected_entities`, `yb_skip_entities`) and subsumes (`no_update_index_list`)
        - YB-PG 15 has added new fields (`ybUseScanTupleInUpdate`, `ybHasWholeRowAttribute`)
        - Merge resolution: Keep newly added fields from both, remove `no_update_index_list`
    - Location: imports
        - My master commit adds the import `nodes/ybbitmatrix.h`
        - YB-PG 15 changes `nodes/relation.h` to `access/relation.h` and adds `nodes/parsenodes.h`, `nodes/pathnodes.h`
        - Merge resolution: Add nodes/ybbitmatrix.h, nodes/parsenodes.h, nodes/pathnodes.h and change nodes/relation.h to access/relation.h
- src/postgres/src/include/executor/executor.h
    - Location: “prototypes from functions in execIndexing.c”
        - My master commit removes the `no_update_index_list` arg from ExecInsertIndexTuples and ExecDeleteIndexTuplesOptimized and removes the function `ContainsIndexRelation`
        - YB-PG 15 has rearranged the location of function definitions, added a `ResultRelInfo` field to the *IndexTuple functions
        - Merge resolution: Keep functions in rearranged locations, add `ResultRelInfo` to and remove `no_update_index_list` from ExecInsertIndexTuples, remove ExecDeleteIndexTuplesOptimized altogether, remove `ContainsIndexRelation`
- src/postgres/src/backend/utils/misc/guc.c:
    - Location: integer GUCs
        - My master commit adds new GUCs (`yb_update_num_cols_to_compare`, `yb_update_max_cols_size_to_compare`)
        - This caused an adjacency conflict with YB-PG 15 (41c091a) that changed QUERY_TUNING to QUERY_TUNING_OTHER for `yb_parallel_range_rows`.
        - Merge resolution: Add my new GUCs and keep the change to QUERY_TUNING_OTHER for `yb_parallel_range_rows`.
- src/postgres/src/include/utils/relcache.h:
    - Location: “Routines to compute/retrieve additional cached information”
        - My master commit adds a new function `YbComputeIndexExprOrPredicateAttrs`.
        - YB-PG 15 adds new functions `RelationGetIdentityKeyBitmap` (upstream PG commit: e7eea52b2d61917fbbdac7f3f895e4ef636e935b), `RelationGetIndexPredicate` and `RelationGetIndexRawAttOptions` (upstream PG commit: 911e70207703799605f5a0e8aad9f06cff067c63) causing adjacency conflicts. Further, 55782d5 moves down `CheckIndexForUpdate` declaration.
        - Merge resolution: Add all new functions, move down `CheckIndexForUpdate` declaration
- src/postgres/src/backend/utils/cache/relcache.c:
    - Location: IsProjectionFunctionalIndex
        - My master commit adds a new function `YbComputeIndexExprOrPredicateAttrs`
        - YB-PG 15 removed a function in the same area: `IsProjectionFunctionalIndex`
        - Merge resolution: Add `YbComputeIndexExprOrPredicateAttrs`, remove `IsProjectionFunctionalIndex`
    - Location: YbRelationGetFKeyReferencedByList
        - Not a merge conflict. In YB-PG 15, `DeconstructFkConstraintRow` has additional outparams to retrieve ON DELETE SET NULL/DEFAULT cols. This info is not needed, hence the params are set to NULL.
        - Not a merge conflict. Fetch the constraint oid from `Form_pg_constraint` instead of fetching it from the HeapTuple directly.
- src/postgres/src/include/commands/trigger.h:
    - Location: “in utils/adt/ri_triggers.c”
        - My master commit adds a new param `yb_skip_entities` to `RI_FKey_pk_upd_check_required` and `RI_FKey_fk_upd_check_required`
        - YB-PG 15 uses TupleTableSlots instead of HeapTuples in the function definition + lint changes
        - Resolution: Added new param and replaced HeapTuples with TupleTableSlots
- src/postgres/src/backend/utils/adt/ri_triggers.c:
    - Same as above
- src/postgres/src/backend/commands/trigger.c:
    - Location: AfterTriggerSaveEvent
    - Same context as above
    - Additionally, YB-PG 15 skips foreign key update checks on partitioned tables (upstream PG commit: ba9a7e392171c83eb3332a757279e7088487f9a2). This behavior is retained in the conflict resolution.
- src/postgres/src/backend/optimizer/util/ybcplan.c:
    - Location: imports
        - My master commit moved around the location of `yb/yql/pggate/ybc_pggate.h` (albeit accidentally)
        - YB-PG 15 (55782d5) moves `ybcplan.h` out of the "YB includes" section to under "postgres.h"
        - Merge resolution: Retain the locations of “ybc_pggate.h” and “ybcplan.h” that are already present in YB-PG 15
- src/postgres/src/backend/optimizer/plan/createplan.c:
    - Location: “create_modifytable_plan”
        - My master commit adds a local variable `yb_is_single_row_update_or_delete`
        - YB-PG 15 has rearranged the location of the variable declarations
        - Merge resolution: Retain PG 15’s rearrangement, add my local variable
- src/postgres/src/backend/executor/execIndexing.c:
    - Location: ExecInsertIndexTuples
        - My master commit updated function definition as per `executor/executor.h`
        - In YB-PG 15, the YB-introduced function `ExecInsertIndexTuplesOptimized` was deleted during the PG 15 initial merge (55782d5), causing a conflict.
        - Merge resolution: Remove `no_update_index_list` arg from ExecInsertIndexTuples and remove `ExecInsertIndexTuplesOptimized`.
    - Location: ExecInsertIndexTuples
        - PG 15 introduced the notion of hints to the storage in case an index is not modified. Yugabyte does not require this, as this computation is already being done (and enforced).
        - Merge resolution: The indexUnchanged hint will always evaluate to false for Yugabyte relations. Added a comment explaining this decision.
    - Location: ExecDeleteIndexTuples
        - My master commit updated function definition as per `executor/executor.h`
        - In YB-PG 15, an extra parameter `resultRelInfo` is added.
        - Merge resolution: Removed `ExecDeleteIndexTuplesOptimized` and subsumed functionality into `ExecDeleteIndexTuples` as there is no longer a difference between the two implementations from a function signature point of view. Added parameter `resultRelInfo` to ExecDeleteIndexTuples
- src/postgres/src/backend/commands/copyfrom.c
    - Location: CopyFrom, CopyMultiInsertBufferFlush
        - Same context as above.
        - Merge resolution: Remove `no_update_index_list` from all invocations of `ExecInsertIndexTuples`.
- src/postgres/src/backend/executor/execReplication.c
    - Location: ExecSimpleRelationInsert, ExecSimpleRelationUpdate
        - Same context as above.
        - Merge resolution: Remove `no_update_index_list` from all invocations of `ExecInsertIndexTuples`.
- src/postgres/src/backend/executor/nodeModifyTable.c
    - Location: imports
        - My master commit adds `executor/ybOptimizeModifyTable.h`
        - YB-PG 15 moved imports from “YB includes” to “Yugabyte includes” and removed extra/unused imports
        - Merge resolution: Imports moved to “Yugabyte includes” and added `executor/ybOptimizeModifyTable.h`
    - Location: YBEqualDatums and YBBuildExtraUpdatedCols
        - Removed these functions as their functionality has moved to `executor/ybOptimizeModifyTable.h`
    - Location: ExecUpdate
        - My master commit introduces changes to compute a list of columns modified by the query for a given tuple
        - In YB-PG 15, ExecUpdate functionality has been broken into  ExecUpdatePrologue, ExecUpdateAct, ExecUpdate and ExecUpdateEpilogue.
        - Merge resolution: Moved my changes into ExecUpdateAct and ExecUpdateEpilogue.
    - Location: ExecModifyTable
        - My master commit introduces a function to compute if a tuple in an UPDATE or a DELETE query has the “wholerow” junk attribute.
        - In YB-PG 15, this logic is propagated from planning time via `plan->ybHasWholeRowAttribute`.
        - Merge resolution: Use the logic in YB-PG 15.
    - Location: YBCHasWholeRowJunkAttr
        - Same context as above.
        - Merge resolution: Remove this function
    - Location: ExecInsert
        - Merge resolution: Remove `no_update_index_list` from four invocations of `ExecInsertIndexTuples`.
    - Location: YBExecUpdateAct
        - Not a conflict, but the use of `ExecMaterializeSlot` has been changed to `ExecFetchSlotHeapTuple`. The slot must be copied out to a HeapTuple in order to determine whether columns are modified by the update query.
- src/postgres/src/backend/rewrite/rewriteHandler.c
    - Location: YbAddWholeRowAttrIfNeeded
        - My master commit adds a new function `YbAddWholeRowAttrIfNeeded`
        - Conflict on PG side is upstream PG 41531e42d34f4aca117d343b5e40f3f757dec5fe and ed4653db8ca7a70ba7a4d329a44812893f8e59c2 adding code at end of rewriteValuesRTE.
        - Merge resolution: Remove all changes from YB master, retain YB-PG 15 changes. This removes functionality which will be re-evaluated and added back in a future diff.
- src/include/executor/ybOptimizeModifyTable.h
    - Location: imports
        - Not a merge conflict, added a new include `nodes/execnodes.h`
    - Location: YbComputeModifiedColumnsAndSkippableEntities
        - Added a new arg `ResultRelInfo *`  to specify the relation whose modified columns are to be computed. Previously, this info was fetched from EState which contained a single relation.
        - In YB-PG 15, EState has been modified to include a list of relations.
- src/postgres/src/backend/executor/ybOptimizeModifyTable.c
    - Location: YBEqualDatums
        - Not a merge conflict, but initialization of `FunctionCallInfoData` changed in YB-PG 15 for call information to be variable length (reference upstream PG commit `a9c35cf85ca1ff72f16f0f10d7ddee6e582b62b8`).
        - Changed this function to use new initialization.
    - Location: YbComputeModifiedColumnsAndSkippableEntities
        - Added a new arg `ResultRelInfo *`  to specify the relation whose modified columns are to be computed. Previously, this info was fetched from EState which contained a single relation.
        - In YB-PG 15, EState has been modified to include a list of relations.
- src/postgres/src/backend/nodes/Makefile
    - Location: OBJS
        - My master commit adds `ybbitmatrix.o` object file.
        - In YB-PG 15, the list of objects was reformatted to have one object/file per line
        - Merge resolution: Added `ybbitmatrix.o` to the head of the list, retaining the new format
- src/postgres/src/backend/executor/Makefile
    - Location: OBJS
        - Resolve as in ad2fedc for ybOptimizeModifyTable.o added by YB master

jenkins: urgent
Jira: DB-7701

Original commit: 63f471a / D34040

Test Plan:
Run the associated pg_regress test as follows:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'

./yb_build.sh --java-test 'org.yb.pgsql.TestPgUpdatePrimaryKey'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgUniqueConstraint'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressTrigger#testPgRegressTrigger'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressDml#testPgRegressDml'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressPushdown#testPgRegressPushdown'
```

Tested scenarios include (but are not limited to):
1. Single row and distributed transactions with and without the feature flag turned on.
2. Relations with a primary key and no secondary indexes or triggers (UPDATEs can take the single row path)
3. Relations with combinations of primary key and secondary indexes.
4. Relations with unconditional before-row triggers.
5. UPDATEs in Colocated databases.
6. UPDATEs covering multiple tuples.
7. Hierarchy of relations with foreign keys.
8. Relations with self-referential foreign keys.
9. Relations with overlapping indexes.
10. Relations having columns with uniqueness constraints.
11. Relations having covering indexes.
12. Relations having partial indexes.
13. Relations having index expressions / predicates.
14. Relations with conditional column triggers.
15. Relations having indexes/constraints out of order (i.e., the order of columns in the relation differs from that of the entity).
16. Relations having combination of hash and range indexes.
17. UPDATEs with correlated subqueries.
18. INSERT ON CONFLICT DO UPDATE.
19. UPDATE RETURNING.
20. UPDATEs on temp tables.

Reviewers: jason, tfoucher

Reviewed By: jason

Subscribers: yql, smishra, jason, pjain

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D37350