Skip to content

Commit

Permalink
[#25075] YSQL: Fix index update optimization
Browse files Browse the repository at this point in the history
Summary:
A bug was discovered with INSERT with ON CONFLICT DO UPDATE when some
secondary indexes fail to be updated and end up corrupted. This bug has multiple
causes.

The main cause is duplicate logic which makes the list of indexes unaffected by the
update. Historically this logic was a part of the function that detects if the update
statement is a single line update. It is a part of the single line update criteria: all
indexes are unaffected by update, and the list of unaffected indexes was a side
effect  of this function. Even if the statement is found not a single line this list is used
further on to skip updates of those indexes. However, the list was not always built.
It wasn't built in case of INSERT with ON CONFLICT DO UPDATE, and in some
cases the function determined that the statement is not a single line and returned
before making that list. Hence as a part of more advanced index update optimization
functionality to make a list of unaffected indexes, along of the list of indexes that
maybe unaffected per row, depending if old and new values differ. Second
implementation added entries to the unaffected list, even if primary key columns
were affected. This bug is fixed in this diff, however proper fix would be to
deduplicate the logic. We need a follow up change,
#25200 was filed to track it.

The bug was mitigated by the per-row logic, where the unaffected indexes list was
cleared if a PK column is found affected. That is the right thing to clear the list of
unaffected indexes for that row being built, but it also means the list of always
unaffected indexes should be initially empty. That diff adds an assertion check to
validate that.

Third cause was discrepancy in the logic, whether advanced index optimization
should be applied or not. In planner INSERT with ON CONFLICT DO UPDATE
allowed advanced index optimization, but not in executor, that's why the mitigation
explained in the previous paragraph did not work. Actual logic in the executor may
be simpler: apply the optimization if it is (still) allowed in the configuration and
planner prepared a data structure for such optimization. Latter basically makes the
criteria in planner the source of trues.
Jira: DB-14205

Test Plan: ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'

Reviewers: kramanathan, smishra

Reviewed By: kramanathan

Subscribers: yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D40404
  • Loading branch information
andrei-mart committed Dec 10, 2024
1 parent 0fb1cae commit 4818b71
Show file tree
Hide file tree
Showing 5 changed files with 136 additions and 16 deletions.
12 changes: 2 additions & 10 deletions src/postgres/src/backend/executor/nodeModifyTable.c
Original file line number Diff line number Diff line change
Expand Up @@ -2400,8 +2400,7 @@ yb_lreplace:;
}

ModifyTable *plan = (ModifyTable *) context->mtstate->ps.plan;
YbCopySkippableEntities(&estate->yb_skip_entities,
plan->yb_skip_entities);
YbCopySkippableEntities(&estate->yb_skip_entities, plan->yb_skip_entities);

/*
* If an update is a "single row transaction", then we have already
Expand Down Expand Up @@ -2430,13 +2429,6 @@ yb_lreplace:;
*/
*is_pk_updated = YbIsPrimaryKeyUpdated(resultRelationDesc,
*cols_marked_for_update);

/*
* TODO(alex): It probably makes more sense to pass a
* transformed slot instead of a plan slot? Note though
* that it can have tuple materialized already.
*/

if (*is_pk_updated)
{
YBCExecuteUpdateReplace(resultRelationDesc, context->planSlot, slot, estate);
Expand Down Expand Up @@ -4589,7 +4581,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
*/
mtstate->yb_is_update_optimization_enabled =
(node->yb_update_affected_entities != NULL &&
operation == CMD_UPDATE && YbIsUpdateOptimizationEnabled());
YbIsUpdateOptimizationEnabled());

mtstate->yb_is_inplace_index_update_enabled = yb_enable_inplace_index_update;

Expand Down
11 changes: 8 additions & 3 deletions src/postgres/src/backend/executor/ybOptimizeModifyTable.c
Original file line number Diff line number Diff line change
Expand Up @@ -381,6 +381,11 @@ YbComputeModifiedEntities(ResultRelInfo *resultRelInfo, HeapTuple oldtuple,
if (affected_entities == NULL)
return NULL;

bool skip_entities_initially_empty PG_USED_FOR_ASSERTS_ONLY =
skip_entities->index_list == NIL &&
skip_entities->referenced_fkey_list == NIL &&
skip_entities->referencing_fkey_list == NIL;

/*
* Clone the update matrix to create a working copy for this tuple. We would
* like to preserve a clean copy for subsequent tuples in this query/plan.
Expand Down Expand Up @@ -442,12 +447,12 @@ YbComputeModifiedEntities(ResultRelInfo *resultRelInfo, HeapTuple oldtuple,
/* Mark the primary key as updated */
if (entity->oid == rel->rd_pkindex)
{
Assert(skip_entities_initially_empty);
/*
* In case the primary key is updated, the entire row must be
* deleted and re-inserted. The no update index list is not
* useful in such cases.
* deleted and re-inserted, and all references updated.
*/
skip_entities->index_list = NIL;
YbClearSkippableEntities(skip_entities);
break;
}
}
Expand Down
13 changes: 10 additions & 3 deletions src/postgres/src/backend/optimizer/util/ybcplan.c
Original file line number Diff line number Diff line change
Expand Up @@ -382,6 +382,7 @@ void
YbCopySkippableEntities(YbSkippableEntities *dst,
const YbSkippableEntities *src)
{
YbClearSkippableEntities(dst);
if (src == NULL)
return;

Expand Down Expand Up @@ -439,8 +440,11 @@ YbClearSkippableEntities(YbSkippableEntities *skip_entities)
return;

list_free(skip_entities->index_list);
skip_entities->index_list = NIL;
list_free(skip_entities->referencing_fkey_list);
skip_entities->referencing_fkey_list = NIL;
list_free(skip_entities->referenced_fkey_list);
skip_entities->referenced_fkey_list = NIL;
}

static AttrToColIdxMap
Expand Down Expand Up @@ -804,6 +808,7 @@ YbUpdateComputeIndexColumnReferences(const Relation rel,
YbSkippableEntities *skip_entities,
int *nentities)
{
bool pk_maybe_modified = false;
/*
* Add the primary key to the head of the entity list so that it gets
* evaluated first.
Expand All @@ -812,7 +817,6 @@ YbUpdateComputeIndexColumnReferences(const Relation rel,
{
Bitmapset *pkbms =
RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_PRIMARY_KEY);
bool pk_maybe_modified = false;
int attnum = -1;
while ((attnum = bms_next_member(pkbms, attnum)) >= 0)
{
Expand Down Expand Up @@ -895,7 +899,7 @@ YbUpdateComputeIndexColumnReferences(const Relation rel,
}
}

if (!index_maybe_modified)
if (!pk_maybe_modified && !index_maybe_modified)
YbAddEntityToSkipList(etype, index_oid, skip_entities);

RelationClose(indexDesc);
Expand All @@ -915,6 +919,9 @@ YbUpdateComputeForeignKeyColumnReferences(const Relation rel,
YbSkippableEntityType etype =
is_referencing_rel ? SKIP_REFERENCING_FKEY : SKIP_REFERENCED_FKEY;
const AttrNumber offset = YBGetFirstLowInvalidAttributeNumber(rel);
Oid pk_oid = RelationGetPrimaryKeyIndex(rel);
bool pk_maybe_modified =
!(OidIsValid(pk_oid) && list_member_oid(skip_entities->index_list, pk_oid));
ListCell *cell;
foreach (cell, fkeylist)
{
Expand All @@ -933,7 +940,7 @@ YbUpdateComputeForeignKeyColumnReferences(const Relation rel,
}
}

if (!fkey_maybe_modified)
if (!pk_maybe_modified && !fkey_maybe_modified)
YbAddEntityToSkipList(etype, fkey->conoid, skip_entities);
}
}
93 changes: 93 additions & 0 deletions src/postgres/src/test/regress/expected/yb_update_optimize_base.out
Original file line number Diff line number Diff line change
Expand Up @@ -1375,6 +1375,22 @@ EXPLAIN (ANALYZE, DIST, COSTS OFF) UPDATE ancestry SET key = 'gparent30' WHERE v
Storage Flush Requests: 1
(12 rows)

-- Same query, no modification, no triggers
EXPLAIN (ANALYZE, DIST, COSTS OFF) UPDATE ancestry SET key = 'gparent30' WHERE value = 'Eddie';
QUERY PLAN
----------------------------------------------------
Update on ancestry (actual rows=0 loops=1)
-> Seq Scan on ancestry (actual rows=1 loops=1)
Storage Filter: (value = 'Eddie'::text)
Storage Table Read Requests: 1
Storage Table Rows Scanned: 10
Storage Table Write Requests: 1
Storage Read Requests: 1
Storage Rows Scanned: 10
Storage Write Requests: 1
Storage Flush Requests: 1
(10 rows)

EXPLAIN (ANALYZE, DIST, COSTS OFF) UPDATE ancestry SET key = 'gparent3' WHERE key = 'gparent30';
QUERY PLAN
--------------------------------------------------------------------------
Expand Down Expand Up @@ -1521,3 +1537,80 @@ EXPLAIN (ANALYZE, DIST, COSTS OFF) EXECUTE updateplan1 (1, 1, 2, 1, 1);
DEALLOCATE updateplan1;
DROP TABLE t_simple;
DROP TABLE t_parent;
--
-- #25075: Test INSERT with UPDATE ON CONFLICT DO UPDATE
--
DROP TABLE IF EXISTS ioc_table;
NOTICE: table "ioc_table" does not exist, skipping
CREATE TABLE ioc_table (id text, uuid_col uuid, name text, primary key ((id, name), uuid_col));
CREATE UNIQUE INDEX NONCONCURRENTLY ioc_table_name_id_uidx ON ioc_table (name, id);
CREATE UNIQUE INDEX NONCONCURRENTLY ioc_table_name_uidx ON ioc_table (name);
CREATE UNIQUE INDEX NONCONCURRENTLY ioc_table_uuid_col_uidx ON ioc_table (uuid_col);
CREATE EXTENSION "uuid-ossp";
-- insert
EXPLAIN (ANALYZE, DIST, COSTS OFF) INSERT INTO ioc_table VALUES ('user-id-1-1', uuid_generate_v4(), 'Test Name 1')
ON CONFLICT (id, name) DO UPDATE SET uuid_col = EXCLUDED.uuid_col;
QUERY PLAN
----------------------------------------------------
Insert on ioc_table (actual rows=0 loops=1)
Conflict Resolution: UPDATE
Conflict Arbiter Indexes: ioc_table_name_id_uidx
Tuples Inserted: 1
Conflicting Tuples: 0
-> Result (actual rows=1 loops=1)
Storage Index Read Requests: 1
Storage Table Write Requests: 1
Storage Index Write Requests: 3
Storage Read Requests: 1
Storage Rows Scanned: 0
Storage Write Requests: 4
Storage Flush Requests: 1
(13 rows)

-- conflict: update
EXPLAIN (ANALYZE, DIST, COSTS OFF) INSERT INTO ioc_table VALUES ('user-id-1-1', uuid_generate_v4(), 'Test Name 1')
ON CONFLICT (id, name) DO UPDATE SET uuid_col = EXCLUDED.uuid_col;
QUERY PLAN
----------------------------------------------------
Insert on ioc_table (actual rows=0 loops=1)
Conflict Resolution: UPDATE
Conflict Arbiter Indexes: ioc_table_name_id_uidx
Tuples Inserted: 0
Conflicting Tuples: 1
-> Result (actual rows=1 loops=1)
Storage Table Read Requests: 1
Storage Table Rows Scanned: 1
Storage Index Read Requests: 1
Storage Index Rows Scanned: 1
Storage Table Write Requests: 2
Storage Index Write Requests: 4
Storage Read Requests: 2
Storage Rows Scanned: 2
Storage Write Requests: 6
Storage Flush Requests: 1
(16 rows)

-- conflict: update again
EXPLAIN (ANALYZE, DIST, COSTS OFF) INSERT INTO ioc_table VALUES ('user-id-1-1', uuid_generate_v4(), 'Test Name 1')
ON CONFLICT (id, name) DO UPDATE SET uuid_col = EXCLUDED.uuid_col;
QUERY PLAN
----------------------------------------------------
Insert on ioc_table (actual rows=0 loops=1)
Conflict Resolution: UPDATE
Conflict Arbiter Indexes: ioc_table_name_id_uidx
Tuples Inserted: 0
Conflicting Tuples: 1
-> Result (actual rows=1 loops=1)
Storage Table Read Requests: 1
Storage Table Rows Scanned: 1
Storage Index Read Requests: 1
Storage Index Rows Scanned: 1
Storage Table Write Requests: 2
Storage Index Write Requests: 4
Storage Read Requests: 2
Storage Rows Scanned: 2
Storage Write Requests: 6
Storage Flush Requests: 1
(16 rows)

DROP TABLE IF EXISTS ioc_table;
23 changes: 23 additions & 0 deletions src/postgres/src/test/regress/sql/yb_update_optimize_base.sql
Original file line number Diff line number Diff line change
Expand Up @@ -312,6 +312,8 @@ EXPLAIN (ANALYZE, DIST, COSTS OFF) UPDATE ancestry SET parent = 'ggparent23', gp
-- Query modifying the key to the row should trigger checks in only the children
-- and grandchildren.
EXPLAIN (ANALYZE, DIST, COSTS OFF) UPDATE ancestry SET key = 'gparent30' WHERE value = 'Eddie';
-- Same query, no modification, no triggers
EXPLAIN (ANALYZE, DIST, COSTS OFF) UPDATE ancestry SET key = 'gparent30' WHERE value = 'Eddie';
EXPLAIN (ANALYZE, DIST, COSTS OFF) UPDATE ancestry SET key = 'gparent3' WHERE key = 'gparent30';

DROP TABLE ancestry;
Expand Down Expand Up @@ -387,3 +389,24 @@ DEALLOCATE updateplan1;

DROP TABLE t_simple;
DROP TABLE t_parent;

--
-- #25075: Test INSERT with UPDATE ON CONFLICT DO UPDATE
--
DROP TABLE IF EXISTS ioc_table;
CREATE TABLE ioc_table (id text, uuid_col uuid, name text, primary key ((id, name), uuid_col));
CREATE UNIQUE INDEX NONCONCURRENTLY ioc_table_name_id_uidx ON ioc_table (name, id);
CREATE UNIQUE INDEX NONCONCURRENTLY ioc_table_name_uidx ON ioc_table (name);
CREATE UNIQUE INDEX NONCONCURRENTLY ioc_table_uuid_col_uidx ON ioc_table (uuid_col);

CREATE EXTENSION "uuid-ossp";
-- insert
EXPLAIN (ANALYZE, DIST, COSTS OFF) INSERT INTO ioc_table VALUES ('user-id-1-1', uuid_generate_v4(), 'Test Name 1')
ON CONFLICT (id, name) DO UPDATE SET uuid_col = EXCLUDED.uuid_col;
-- conflict: update
EXPLAIN (ANALYZE, DIST, COSTS OFF) INSERT INTO ioc_table VALUES ('user-id-1-1', uuid_generate_v4(), 'Test Name 1')
ON CONFLICT (id, name) DO UPDATE SET uuid_col = EXCLUDED.uuid_col;
-- conflict: update again
EXPLAIN (ANALYZE, DIST, COSTS OFF) INSERT INTO ioc_table VALUES ('user-id-1-1', uuid_generate_v4(), 'Test Name 1')
ON CONFLICT (id, name) DO UPDATE SET uuid_col = EXCLUDED.uuid_col;
DROP TABLE IF EXISTS ioc_table;

0 comments on commit 4818b71

Please sign in to comment.