-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rowenc: fix splitting lookup rows into family spans in some cases #76563
Conversation
Can you explain this a little more? (perhaps with an example) |
5150336
to
222e6a9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've thought a bit more about this, and I think I mostly got a grasp of this code, so I added more expansive comment. RFAL.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde and @yuzefovich)
-- commits, line 19 at r1:
This makes me wonder why we allow specifying PK columns in families - they always exist in all families.
-- commits, line 30 at r1:
I don't understand why the trailing /1/1
exists in this key.
pkg/sql/rowenc/index_encoding.go, line 298 at r1 (raw file):
if needed { neededFamilyIDs = append(neededFamilyIDs, family.ID) allFamiliesNullable = allFamiliesNullable && nullable
is it worth adding multi-column-family tests in TestIndexKey
in index_encoding_test.go
for this logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @mgartner and @yuzefovich)
-- commits, line 19 at r1:
Let's give these column families names e.g. fam_c1
, fam_c2
and refer to them by name. Some of the text below uses "first column family" when I think it actually means "column family 1" which is the second one.
Previously, mgartner (Marcus Gartner) wrote…
I don't understand why the trailing
/1/1
exists in this key.
I think it should be /t/pk/1/false/1
no? (or we could use a given fam_c1
name instead of the ID)
pkg/sql/rowenc/index_encoding.go, line 188 at r1 (raw file):
// retrieve neededCols for the specified table and index. The returned descpb.FamilyIDs // are in sorted order. func NeededColumnFamilyIDs(
If you have some time, it would be worthwhile to add a unit test for this. It has non-trivial logic and it's not great to only have high-level logic tests covering it.
pkg/sql/rowenc/index_encoding.go, line 290 at r1 (raw file):
} if !columns[columnOrdinal].IsNullable() && !indexedCols.Contains(columnOrdinal) { // This column is non-nullable and is not indexed, thus, it must
👍
Previously, we could incorrectly calculate whether fetching a KV for `FamilyID==0` is needed. The zeroth KV is always present, and we rely on it to find NULL values in columns in other column families. Before this patch we could "optimize" the behavior to not fetch the zeroth column family with composite non-nullable columns when we need to lookup nullable column families (families that only contain nullable columns); as a result, the lookup could come back empty when those columns only have NULLs. Consider the following example: ``` CREATE TABLE t ( pk1 DECIMAL NOT NULL, pk2 INT8 NOT NULL, c1 INT8, c2 INT8, PRIMARY KEY (pk1, pk2), UNIQUE (pk2), FAMILY fam_c1 (c1), FAMILY fam_c2 (pk1, pk2, c2) ); INSERT INTO t (pk1, pk2, c1, c2) VALUES (1:::DECIMAL, 0:::INT8, 0:::INT8); ``` When the INSERT statement is evaluated, only a KV entry for `fam_c1` is actually put into the KV layer (because the value part of `fam_c2` - `c2` column - is NULL). Next, when evaluating a query `SELECT c2 FROM t WHERE pk2 = 0;`, we first will scan the secondary unique index to fetch `/1/0` primary key. Then, we'll do an index join against the primary key to fetch `c2`. Before this patch, we would perform a Get of `/t/pk/1/0/1/1` (essentially trying to read `c2` directly of `fam_c2` - `/1/1` suffix is the varlen encoding of family ID 1); however, there is no such entry, so we would mistakenly think that the row is absent and return no rows. The problem was that we incorrectly determined `fam_c2` to be non-nullable, so we assumed it to always have a KV entry if a row is present. We made that determination based on the fact that `pk1` column (which is non-nullable, indexed, and composite) must force the existence of that KV entry because we're using the primary index encoding. However, I think that reasoning is just bogus. Any column family should be considered non-nullable IFF it has a non-nullable non-indexed column, so this patch removes all the business about non-nullable, indexed, composite columns. The zeroth column family is an exception that it is always non-nullable. Note that this patch makes us fetch more column families, so it should not be a correctness regression although it could be a performance regression if my reasoning is wrong. With this change in place we now correctly perform a scan of `/t/pk/1/{0-1}` span. Release note (bug fix): Previously, CockroachDB could incorrectly not return a row from a table with multiple column families when that row contains a NULL value when a composite type (FLOAT, DECIMAL, COLLATED STRING, or an arrays of such a type) is included in the PRIMARY KEY. For the bug to occur composite column from the PRIMARY KEY must be included in any column family other than the first one.
222e6a9
to
e4e8b51
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @RaduBerinde)
Previously, mgartner (Marcus Gartner) wrote…
This makes me wonder why we allow specifying PK columns in families - they always exist in all families.
That's a good point, filed #76604.
Previously, RaduBerinde wrote…
Let's give these column families names e.g.
fam_c1
,fam_c2
and refer to them by name. Some of the text below uses "first column family" when I think it actually means "column family 1" which is the second one.
Good point, reworded.
Previously, RaduBerinde wrote…
I think it should be
/t/pk/1/false/1
no? (or we could use a givenfam_c1
name instead of the ID)
I was slightly wrong - it should have been /t/pk/1/false/1/1
. /1/1
suffix is the varlen encoding of family ID 1, so the format is /table_name/index_name/pk1_value/pk2_value/family_id_varlen_encoded/
.
pkg/sql/rowenc/index_encoding.go, line 188 at r1 (raw file):
Previously, RaduBerinde wrote…
If you have some time, it would be worthwhile to add a unit test for this. It has non-trivial logic and it's not great to only have high-level logic tests covering it.
I agree. It seems a bit involved to add those unit tests, so I filed #76623 and will add them during the stability period.
pkg/sql/rowenc/index_encoding.go, line 298 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
is it worth adding multi-column-family tests in
TestIndexKey
inindex_encoding_test.go
for this logic?
TestIndexKey
is too high-level for this. I agree with you and Radu though that we should have some unit tests for this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner, @RaduBerinde, and @yuzefovich)
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I was slightly wrong - it should have been
/t/pk/1/false/1/1
./1/1
suffix is the varlen encoding of family ID 1, so the format is/table_name/index_name/pk1_value/pk2_value/family_id_varlen_encoded/
.
I'm confused.. this "pretty key" representation shows the values after decoding the various components, it shouldn't matter how it is encoded. You are right that they show up like that (https://gist.github.com/RaduBerinde/82a8088c20c3df449dd4a8d9d76aef89) but I don't understand why.
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from e4e8b51 to blathers/backport-release-21.1-76563: POST https://api.github.com/repos/cockroachlabs/cockroach/merges: 403 Resource not accessible by integration [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.1.x failed. See errors above. error creating merge commit from e4e8b51 to blathers/backport-release-21.2-76563: POST https://api.github.com/repos/cockroachlabs/cockroach/merges: 403 Resource not accessible by integration [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
blathers backport 21.2 |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from e4e8b51 to blathers/backport-release-21.2-76563: POST https://api.github.com/repos/cockroachlabs/cockroach/merges: 403 Resource not accessible by integration [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2 failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
Previously, we could incorrectly calculate whether fetching a KV for
FamilyID==0
is needed. The zeroth KV is always present, and we rely onit to find NULL values in columns in other column families. Before this
patch we could "optimize" the behavior to not fetch the zeroth column
family with composite non-nullable columns when we need to lookup
nullable column families (families that only contain nullable columns);
as a result, the lookup could come back empty when those columns only
have NULLs.
Consider the following example:
When the INSERT statement is evaluated, only a KV entry for
fam_c1
isactually put into the KV layer (because the value part of
fam_c2
-c2
column - is NULL).
Next, when evaluating a query
SELECT c2 FROM t WHERE pk2 = 0;
, wefirst will scan the secondary unique index to fetch
/1/0
primarykey. Then, we'll do an index join against the primary key to fetch
c2
.Before this patch, we would perform a Get of
/t/pk/1/0/1/1
(essentiallytrying to read
c2
directly offam_c2
-/1/1
suffix is the varlenencoding of family ID 1); however, there is no such entry, so we would
mistakenly think that the row is absent and return no rows.
The problem was that we incorrectly determined
fam_c2
to benon-nullable, so we assumed it to always have a KV entry if a row is
present. We made that determination based on the fact that
pk1
column(which is non-nullable, indexed, and composite) must force the existence
of that KV entry because we're using the primary index encoding.
However, I think that reasoning is just bogus. Any column family should
be considered non-nullable IFF it has a non-nullable non-indexed column,
so this patch removes all the business about non-nullable, indexed,
composite columns. The zeroth column family is an exception that it is
always non-nullable.
Note that this patch makes us fetch more column families, so it should
not be a correctness regression although it could be a performance
regression if my reasoning is wrong.
With this change in place we now correctly perform a scan of
/t/pk/1/{0-1}
span.Fixes: #76289.
Release note (bug fix): Previously, CockroachDB could incorrectly not
return a row from a table with multiple column families when that row
contains a NULL value when a composite type (FLOAT, DECIMAL, COLLATED
STRING, or an arrays of such a type) is included in the PRIMARY KEY. For
the bug to occur composite column from the PRIMARY KEY must be included
in any column family other than the first one.