rowenc: fix splitting lookup rows into family spans in some cases #76563

yuzefovich · 2022-02-15T04:09:55Z

Previously, we could incorrectly calculate whether fetching a KV for
FamilyID==0 is needed. The zeroth KV is always present, and we rely on
it to find NULL values in columns in other column families. Before this
patch we could "optimize" the behavior to not fetch the zeroth column
family with composite non-nullable columns when we need to lookup
nullable column families (families that only contain nullable columns);
as a result, the lookup could come back empty when those columns only
have NULLs.

Consider the following example:

CREATE TABLE t (
  pk1 DECIMAL NOT NULL, pk2 INT8 NOT NULL, c1 INT8, c2 INT8,
  PRIMARY KEY (pk1, pk2),
  UNIQUE (pk2),
  FAMILY fam_c1 (c1), FAMILY fam_c2 (pk1, pk2, c2)
);
INSERT INTO t (pk1, pk2, c1, c2) VALUES (1:::DECIMAL, 0:::INT8, 0:::INT8);

When the INSERT statement is evaluated, only a KV entry for fam_c1 is
actually put into the KV layer (because the value part of fam_c2 - c2
column - is NULL).

Next, when evaluating a query SELECT c2 FROM t WHERE pk2 = 0;, we
first will scan the secondary unique index to fetch /1/0 primary
key. Then, we'll do an index join against the primary key to fetch c2.
Before this patch, we would perform a Get of /t/pk/1/0/1/1 (essentially
trying to read c2 directly of fam_c2 - /1/1 suffix is the varlen
encoding of family ID 1); however, there is no such entry, so we would
mistakenly think that the row is absent and return no rows.

The problem was that we incorrectly determined fam_c2 to be
non-nullable, so we assumed it to always have a KV entry if a row is
present. We made that determination based on the fact that pk1 column
(which is non-nullable, indexed, and composite) must force the existence
of that KV entry because we're using the primary index encoding.

However, I think that reasoning is just bogus. Any column family should
be considered non-nullable IFF it has a non-nullable non-indexed column,
so this patch removes all the business about non-nullable, indexed,
composite columns. The zeroth column family is an exception that it is
always non-nullable.

Note that this patch makes us fetch more column families, so it should
not be a correctness regression although it could be a performance
regression if my reasoning is wrong.

With this change in place we now correctly perform a scan of
/t/pk/1/{0-1} span.

Fixes: #76289.

Release note (bug fix): Previously, CockroachDB could incorrectly not
return a row from a table with multiple column families when that row
contains a NULL value when a composite type (FLOAT, DECIMAL, COLLATED
STRING, or an arrays of such a type) is included in the PRIMARY KEY. For
the bug to occur composite column from the PRIMARY KEY must be included
in any column family other than the first one.

cockroach-teamcity · 2022-02-15T04:10:02Z

This change is

RaduBerinde · 2022-02-15T04:41:27Z

we rely on it to find NULL values in columns in other column families

Can you explain this a little more? (perhaps with an example)

yuzefovich

I've thought a bit more about this, and I think I mostly got a grasp of this code, so I added more expansive comment. RFAL.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)

mgartner

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde and @yuzefovich)

-- commits, line 19 at r1:
This makes me wonder why we allow specifying PK columns in families - they always exist in all families.

-- commits, line 30 at r1:
I don't understand why the trailing /1/1 exists in this key.

pkg/sql/rowenc/index_encoding.go, line 298 at r1 (raw file):

		if needed {
			neededFamilyIDs = append(neededFamilyIDs, family.ID)
			allFamiliesNullable = allFamiliesNullable && nullable

is it worth adding multi-column-family tests in TestIndexKey in index_encoding_test.go for this logic?

RaduBerinde

Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @mgartner and @yuzefovich)

-- commits, line 19 at r1:
Let's give these column families names e.g. fam_c1, fam_c2 and refer to them by name. Some of the text below uses "first column family" when I think it actually means "column family 1" which is the second one.

-- commits, line 30 at r1:

Previously, mgartner (Marcus Gartner) wrote…

I don't understand why the trailing /1/1 exists in this key.

I think it should be /t/pk/1/false/1 no? (or we could use a given fam_c1 name instead of the ID)

pkg/sql/rowenc/index_encoding.go, line 188 at r1 (raw file):

// retrieve neededCols for the specified table and index. The returned descpb.FamilyIDs
// are in sorted order.
func NeededColumnFamilyIDs(

If you have some time, it would be worthwhile to add a unit test for this. It has non-trivial logic and it's not great to only have high-level logic tests covering it.

pkg/sql/rowenc/index_encoding.go, line 290 at r1 (raw file):

			}
			if !columns[columnOrdinal].IsNullable() && !indexedCols.Contains(columnOrdinal) {
				// This column is non-nullable and is not indexed, thus, it must

👍

Previously, we could incorrectly calculate whether fetching a KV for `FamilyID==0` is needed. The zeroth KV is always present, and we rely on it to find NULL values in columns in other column families. Before this patch we could "optimize" the behavior to not fetch the zeroth column family with composite non-nullable columns when we need to lookup nullable column families (families that only contain nullable columns); as a result, the lookup could come back empty when those columns only have NULLs. Consider the following example: ``` CREATE TABLE t ( pk1 DECIMAL NOT NULL, pk2 INT8 NOT NULL, c1 INT8, c2 INT8, PRIMARY KEY (pk1, pk2), UNIQUE (pk2), FAMILY fam_c1 (c1), FAMILY fam_c2 (pk1, pk2, c2) ); INSERT INTO t (pk1, pk2, c1, c2) VALUES (1:::DECIMAL, 0:::INT8, 0:::INT8); ``` When the INSERT statement is evaluated, only a KV entry for `fam_c1` is actually put into the KV layer (because the value part of `fam_c2` - `c2` column - is NULL). Next, when evaluating a query `SELECT c2 FROM t WHERE pk2 = 0;`, we first will scan the secondary unique index to fetch `/1/0` primary key. Then, we'll do an index join against the primary key to fetch `c2`. Before this patch, we would perform a Get of `/t/pk/1/0/1/1` (essentially trying to read `c2` directly of `fam_c2` - `/1/1` suffix is the varlen encoding of family ID 1); however, there is no such entry, so we would mistakenly think that the row is absent and return no rows. The problem was that we incorrectly determined `fam_c2` to be non-nullable, so we assumed it to always have a KV entry if a row is present. We made that determination based on the fact that `pk1` column (which is non-nullable, indexed, and composite) must force the existence of that KV entry because we're using the primary index encoding. However, I think that reasoning is just bogus. Any column family should be considered non-nullable IFF it has a non-nullable non-indexed column, so this patch removes all the business about non-nullable, indexed, composite columns. The zeroth column family is an exception that it is always non-nullable. Note that this patch makes us fetch more column families, so it should not be a correctness regression although it could be a performance regression if my reasoning is wrong. With this change in place we now correctly perform a scan of `/t/pk/1/{0-1}` span. Release note (bug fix): Previously, CockroachDB could incorrectly not return a row from a table with multiple column families when that row contains a NULL value when a composite type (FLOAT, DECIMAL, COLLATED STRING, or an arrays of such a type) is included in the PRIMARY KEY. For the bug to occur composite column from the PRIMARY KEY must be included in any column family other than the first one.

yuzefovich

TFTRs!

bors r+

Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @RaduBerinde)

-- commits, line 19 at r1:

Previously, mgartner (Marcus Gartner) wrote…

This makes me wonder why we allow specifying PK columns in families - they always exist in all families.

That's a good point, filed #76604.

-- commits, line 19 at r1:

Previously, RaduBerinde wrote…

Let's give these column families names e.g. fam_c1, fam_c2 and refer to them by name. Some of the text below uses "first column family" when I think it actually means "column family 1" which is the second one.

Good point, reworded.

-- commits, line 30 at r1:

Previously, RaduBerinde wrote…

I think it should be /t/pk/1/false/1 no? (or we could use a given fam_c1 name instead of the ID)

I was slightly wrong - it should have been /t/pk/1/false/1/1. /1/1 suffix is the varlen encoding of family ID 1, so the format is /table_name/index_name/pk1_value/pk2_value/family_id_varlen_encoded/.

pkg/sql/rowenc/index_encoding.go, line 188 at r1 (raw file):

Previously, RaduBerinde wrote…

If you have some time, it would be worthwhile to add a unit test for this. It has non-trivial logic and it's not great to only have high-level logic tests covering it.

I agree. It seems a bit involved to add those unit tests, so I filed #76623 and will add them during the stability period.

pkg/sql/rowenc/index_encoding.go, line 298 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

is it worth adding multi-column-family tests in TestIndexKey in index_encoding_test.go for this logic?

TestIndexKey is too high-level for this. I agree with you and Radu though that we should have some unit tests for this function.

RaduBerinde

Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner, @RaduBerinde, and @yuzefovich)

-- commits, line 30 at r1:

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I was slightly wrong - it should have been /t/pk/1/false/1/1. /1/1 suffix is the varlen encoding of family ID 1, so the format is /table_name/index_name/pk1_value/pk2_value/family_id_varlen_encoded/.

I'm confused.. this "pretty key" representation shows the values after decoding the various components, it shouldn't matter how it is encoded. You are right that they show up like that (https://gist.github.com/RaduBerinde/82a8088c20c3df449dd4a8d9d76aef89) but I don't understand why.

craig · 2022-02-15T23:28:02Z

Build succeeded:

GitHub CI (Cockroach)

blathers-crl · 2022-02-15T23:28:10Z

Encountered an error creating backports. Some common things that can go wrong:

The backport branch might have already existed.
There was a merge conflict.
The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.

error creating merge commit from e4e8b51 to blathers/backport-release-21.1-76563: POST https://api.github.com/repos/cockroachlabs/cockroach/merges: 403 Resource not accessible by integration []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.1.x failed. See errors above.

error creating merge commit from e4e8b51 to blathers/backport-release-21.2-76563: POST https://api.github.com/repos/cockroachlabs/cockroach/merges: 403 Resource not accessible by integration []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.2.x failed. See errors above.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.}

dt · 2022-02-16T00:26:11Z

blathers backport 21.2

blathers-crl · 2022-02-16T00:26:15Z

Encountered an error creating backports. Some common things that can go wrong:

The backport branch might have already existed.
There was a merge conflict.
The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.

error creating merge commit from e4e8b51 to blathers/backport-release-21.2-76563: POST https://api.github.com/repos/cockroachlabs/cockroach/merges: 403 Resource not accessible by integration []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.2 failed. See errors above.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.}

yuzefovich force-pushed the secondary-index-bug branch from 5150336 to 222e6a9 Compare February 15, 2022 06:14

yuzefovich marked this pull request as ready for review February 15, 2022 06:14

yuzefovich requested review from RaduBerinde, mgartner and a team February 15, 2022 06:14

yuzefovich commented Feb 15, 2022

View reviewed changes

yuzefovich added backport-21.1.x labels Feb 15, 2022

mgartner approved these changes Feb 15, 2022

View reviewed changes

RaduBerinde approved these changes Feb 15, 2022

View reviewed changes

yuzefovich force-pushed the secondary-index-bug branch from 222e6a9 to e4e8b51 Compare February 15, 2022 22:04

yuzefovich commented Feb 15, 2022

View reviewed changes

RaduBerinde reviewed Feb 15, 2022

View reviewed changes

craig bot merged commit ffd8b81 into cockroachdb:master Feb 15, 2022

yuzefovich deleted the secondary-index-bug branch February 16, 2022 01:31

This was referenced Feb 16, 2022

release-21.2: rowenc: fix splitting lookup rows into family spans in some cases #76636

Merged

release-21.1: rowenc: fix splitting lookup rows into family spans in some cases #76637

Merged

yuzefovich mentioned this pull request Sep 19, 2022

rowenc: fix needed column families computation for secondary indexes #88174

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rowenc: fix splitting lookup rows into family spans in some cases #76563

rowenc: fix splitting lookup rows into family spans in some cases #76563

yuzefovich commented Feb 15, 2022 •

edited

Loading

cockroach-teamcity commented Feb 15, 2022

RaduBerinde commented Feb 15, 2022

yuzefovich left a comment

mgartner left a comment

RaduBerinde left a comment

yuzefovich left a comment

RaduBerinde left a comment

craig bot commented Feb 15, 2022

blathers-crl bot commented Feb 15, 2022

dt commented Feb 16, 2022

blathers-crl bot commented Feb 16, 2022

rowenc: fix splitting lookup rows into family spans in some cases #76563

rowenc: fix splitting lookup rows into family spans in some cases #76563

Conversation

yuzefovich commented Feb 15, 2022 • edited Loading

cockroach-teamcity commented Feb 15, 2022

RaduBerinde commented Feb 15, 2022

yuzefovich left a comment

Choose a reason for hiding this comment

mgartner left a comment

Choose a reason for hiding this comment

RaduBerinde left a comment

Choose a reason for hiding this comment

yuzefovich left a comment

Choose a reason for hiding this comment

RaduBerinde left a comment

Choose a reason for hiding this comment

craig bot commented Feb 15, 2022

blathers-crl bot commented Feb 15, 2022

dt commented Feb 16, 2022

blathers-crl bot commented Feb 16, 2022

yuzefovich commented Feb 15, 2022 •

edited

Loading