-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: prune partial index columns and simplify partial index projections #58358
Conversation
b5b578a
to
bde9ef1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
[nit] in the commit/PR message: "neither its indexed columns not the columns referenced" -> "neither its indexed columns nor the columns referenced"; "will not require changed to the associated partial index." -> "will not require changes to the associated partial index."
Reviewed 2 of 2 files at r1, 2 of 2 files at r2, 9 of 9 files at r3, 2 of 2 files at r4, 7 of 7 files at r5, 5 of 5 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @mgartner, and @RaduBerinde)
pkg/sql/opt/table_meta.go, line 285 at r3 (raw file):
// partial index expressions that have been built for a table when formatting // opt expressions. Use PartialIndexPredicate in all other cases. func (tm *TableMeta) PartialIndexPredicatesForFormattingOnly() map[cat.IndexOrdinal]ScalarExpr {
Nice name!
pkg/sql/opt/norm/mutation_funcs.go, line 37 at r6 (raw file):
updateCols.Add(tabMeta.MetaID.ColumnID(ord)) } }
Isn't there a ToSet
function on OptionalColList
that @RaduBerinde added recently to do this?
pkg/sql/opt/norm/mutation_funcs.go, line 68 at r6 (raw file):
update requires may update
^ garbled sentence
pkg/sql/opt/norm/mutation_funcs.go, line 95 at r6 (raw file):
} ord++
should this ordinal be incremented for all partial indexes, or only those that are eligible to be simplified? this seems error prone...
would be good to add some test cases with multiple partial indexes, some of which are eligible for simplification and some of which are not
pkg/sql/opt/norm/prune_cols_funcs.go, line 150 at r5 (raw file):
// update does not require changes to the index. Partial indexes may be // updated (even when a column in the index is not changing) when the // predicate reference columns that are being updated. For example, rows
[nit] reference -> references
pkg/sql/opt/norm/testdata/rules/mutation, line 62 at r6 (raw file):
# and columns referenced in predicates are not mutating. norm expect=SimplifyPartialIndexProjections UPSERT INTO t (k, a, b) VALUES (1, 2, 3)
can you add an upsert test like this but with one of the columns referenced in a predicate that has a default value that satisfies the predicate? (so therefore it would get added to the index on the insert case.)
pkg/sql/opt/norm/testdata/rules/mutation, line 62 at r6 (raw file): Previously, rytaft (Rebecca Taft) wrote…
Hmm, great point. I added this case, but it's broken because the default column is an insert column, not an update column. I think this proves that this normalization isn't possible for Upserts—the Insert columns will always contain all target table columns. I need to think about whether the prune cols changes in the 5th commit are also restricted to Updates... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @RaduBerinde, and @rytaft)
pkg/sql/opt/table_meta.go, line 285 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Nice name!
😄
pkg/sql/opt/norm/mutation_funcs.go, line 37 at r6 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Isn't there a
ToSet
function onOptionalColList
that @RaduBerinde added recently to do this?
We're collect the target table columns here, not the update column IDs. Notice how we are converting the ordinal ord
to a column id, rather than collecting col
. This is tricky, so I've expanded the comment to explain a bit more.
pkg/sql/opt/norm/mutation_funcs.go, line 68 at r6 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
update requires may update
^ garbled sentence
Done.
pkg/sql/opt/norm/mutation_funcs.go, line 95 at r6 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
should this ordinal be incremented for all partial indexes, or only those that are eligible to be simplified? this seems error prone...
would be good to add some test cases with multiple partial indexes, some of which are eligible for simplification and some of which are not
Great catch! I've expanded the tests to catch this. Added some non-partial indexes into the mix too for the hell of it. I've moved the incrementing of ord
to directly after the partial index check, which hopefully makes this less error-prone to work with in the future.
pkg/sql/opt/norm/prune_cols_funcs.go, line 150 at r5 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
[nit] reference -> references
Done.
pkg/sql/opt/norm/testdata/rules/mutation, line 62 at r6 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Hmm, great point. I added this case, but it's broken because the default column is an insert column, not an update column. I think this proves that this normalization isn't possible for Upserts—the Insert columns will always contain all target table columns. I need to think about whether the prune cols changes in the 5th commit are also restricted to Updates...
I've updated this rule so it no longer matches Upsert
s—these projections cannot be simplified because if the Upsert
results in an "insert", then the evaluated predicate expressions are required.
The prune cols changes in the 5th commit look to be fine.
2ff0060
to
952a1b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r8.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @mgartner, and @RaduBerinde)
pkg/sql/opt/norm/testdata/rules/mutation, line 69 at r8 (raw file):
└── 2 [as=a_new:21] # Simplify UPDATE partial index put/del column to false of second partial index
of -> for ?
952a1b9
to
a70d7aa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball, @RaduBerinde, and @rytaft)
pkg/sql/opt/norm/testdata/rules/mutation, line 69 at r8 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
of -> for ?
Done.
TFTR! bors r=rytaft |
Build failed: |
a70d7aa
to
e92a8a7
Compare
bors r=rytaft |
Build failed: |
We no longer derive output prune columns for Upsert, Update, and Delete ops in `DerivePruneCols`. There are no PruneCols rules for these operators, so deriving their prune columns was only performing unnecessary work. There are other rules that prune the fetch and return columns for these operators. These rules do not rely on `DerivePruneCols`. Release note: None
Previously, the `row.MakeUpdater` function had logic to determine the fetch columns required for an update operation. This is not necessary because the cost based optimizer already determines the necessary fetch columns and plumbs them to `MakeUpdater` as the `requestedCols` argument. Release note: None
Previously, partial index predicate expressions in TableMeta were the source-of-truth used within the optimizer to determine if an index is a partial index. However, partial index predicates are not added to TableMeta for all types of statements in optbuilder. Therefore, it was not safe to assume this was a source-of-truth. This commit unexports the map of partial index predicates in TableMeta. Access to partial index predicates must now be done via `TableMeta.PartialIndexPredicate`. This function checks the catalog to determine if an index is a partial index, and panics if there is not a corresponding predicate expression in the partial index predicate map. This makes the function an actual a source-of-truth. Release note: None
…ex.go Release note: None
Indexed columns of partial indexes are now only fetched for UPDATE and UPSERT operations when needed. They are pruned in cases where it is guaranteed that they are not needed to build old or new index entries. For example, consider the table and UPDATE: CREATE TABLE t ( a INT PRIMARY KEY, b INT, c INT, d INT, INDEX (b) WHERE c > 0, FAMILY (a), FAMILY (b), FAMILY (c), FAMILY (d) ) UPDATE t SET d = d + 1 WHERE a = 1 The partial index is guaranteed not to change with this UPDATE because neither its indexed columns nor the columns referenced in its predicate are mutating. Therefore, the existing values of b do not need to be fetched to maintain the state of the partial index. Furthermore, the primary index does require the existing values of b because no columns in b's family are mutating. So, b can be pruned from the UPDATE's fetch columns. Release note (performance improvement): Previously, indexed columns of partial indexes were always fetched for UPDATEs and UPSERTs. Now they are only fetched if they are required for maintaining the state of the index. If an UPDATE or UPSERT mutates columns that are neither indexed by a partial index nor referenced in a partial index predicate, they will no longer be fetched (assuming that they are not needed to maintain the state of other indexes, including the primary index).
The `SimplifyPartialIndexProjections` normalization rule has been added that normalizes synthesized partial index PUT and DEL columns to False when it is guaranteed that a mutation will not require changes to the associated partial index. This normalization can lead to further normalizations, such as pruning columns that the synthesized projections relied on. The motivation for this change is to allow fully disjoint updates to different columns in the same row, when the columns are split across different families. By pruning columns not needed to maintain a partial index, we're not forced to scan all column families. This can ultimately reduce contention during updates. Release note (performance improvement): UPDATE operations on tables with partial indexes no longer evaluate partial index predicate expressions when it is guaranteed that the operation will not alter the state of the partial index. In some cases, this can eliminate fetching the existing value of columns that are referenced in partial index predicates.
e92a8a7
to
039fb1b
Compare
bors r=rytaft |
Build succeeded: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Late to the party, but Nice work!
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball, @mgartner, @RaduBerinde, and @rytaft)
pkg/sql/opt/optbuilder/partial_index.go, line 35 at r13 (raw file):
// normalize the partial index predicates. func (b *Builder) addPartialIndexPredicatesForTable( tabMeta *opt.TableMeta, scan memo.RelExpr, includeDeletable bool,
Is there any harm to always including any deletable indexes? It would be one less thing to worry about (currently the code that calls this function, and the code that uses index predicates must "match" w.r.t this).
pkg/sql/opt/optbuilder/partial_index.go, line 35 at r13 (raw file): Previously, RaduBerinde wrote…
Good point: #58714 |
Fixes #51623
opt: do not derive prune columns for Upsert, Update, Delete
We no longer derive output prune columns for Upsert, Update, and Delete
ops in
DerivePruneCols
. There are no PruneCols rules for theseoperators, so deriving their prune columns was only performing
unnecessary work. There are other rules that prune the fetch and return
columns for these operators. These rules do not rely on
DerivePruneCols
.Release note: None
sql: remove logic to determine fetch cols in row updater
Previously, the
row.MakeUpdater
function had logic to determine thefetch columns required for an update operation. This is not necessary
because the cost based optimizer already determines the necessary fetch
columns and plumbs them to
MakeUpdater
as therequestedCols
argument.
Release note: None
opt: safer access to partial index predicates in TableMeta
Previously, partial index predicate expressions in TableMeta were the
source-of-truth used within the optimizer to determine if an index is a
partial index. However, partial index predicates are not added to
TableMeta for all types of statements in optbuilder. Therefore, it was
not safe to assume this was a source-of-truth.
This commit unexports the map of partial index predicates in TableMeta.
Access to partial index predicates must now be done via
TableMeta.PartialIndexPredicate
. This function checks the catalog todetermine if an index is a partial index, and panics if there is not a
corresponding predicate expression in the partial index predicate map.
This makes the function an actual a source-of-truth.
Release note: None
opt: move addPartialIndexPredicatesForTable to optbuilder/partial_index.go
Release note: None
opt: prune update/upsert fetch columns not needed for partial indexes
Indexed columns of partial indexes are now only fetched for UPDATE and
UPSERT operations when needed. They are pruned in cases where it is
guaranteed that they are not needed to build old or new index entries.
For example, consider the table and UPDATE:
The partial index is guaranteed not to change with this UPDATE because
neither its indexed columns not the columns referenced in its predicate
are mutating. Therefore, the existing values of b do not need to be
fetched to maintain the state of the partial index. Furthermore, the
primary index does require the existing values of b because no columns
in b's family are mutating. So, b can be pruned from the UPDATE's fetch
columns.
Release note (performance improvement): Previously, indexed columns of
partial indexes were always fetched for UPDATEs and UPSERTs. Now they
are only fetched if they are required for maintaining the state of the
index. If an UPDATE or UPSERT mutates columns that are neither indexed by a
partial index nor referenced in a partial index predicate, they will no
longer be fetched (assuming that they are not needed to maintain the
state of other indexes, including the primary index).
opt: normalize partial index PUT/DEL projections to false
The
SimplifyPartialIndexProjections
normalization rule has been addedthat normalizes synthesized partial index PUT and DEL columns to False
when it is guaranteed that a mutation will not require changed to the
associated partial index. This normalization can lead to further
normalizations, such as pruning columns that the synthesized projections
relied on.
The motivation for this change is to allow fully disjoint updates to
different columns in the same row, when the columns are split across
different families. By pruning columns not needed to maintain a partial
index, we're not forced to scan all column families. This can ultimately
reduce contention during updates.
Release note (performance improvement): UPDATE and UPSERT operations on
tables with partial indexes no longer evaluate partial index predicate
expressions when it is guaranteed that the operation will not alter the
state of the partial index. In some cases, this can eliminate fetching
the existing value of columns that are referenced in partial index
predicates.