Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rowenc: fix splitting lookup rows into family spans in some cases #76563

Merged
merged 1 commit into from
Feb 15, 2022

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Feb 15, 2022

Previously, we could incorrectly calculate whether fetching a KV for
FamilyID==0 is needed. The zeroth KV is always present, and we rely on
it to find NULL values in columns in other column families. Before this
patch we could "optimize" the behavior to not fetch the zeroth column
family with composite non-nullable columns when we need to lookup
nullable column families (families that only contain nullable columns);
as a result, the lookup could come back empty when those columns only
have NULLs.

Consider the following example:

CREATE TABLE t (
  pk1 DECIMAL NOT NULL, pk2 INT8 NOT NULL, c1 INT8, c2 INT8,
  PRIMARY KEY (pk1, pk2),
  UNIQUE (pk2),
  FAMILY fam_c1 (c1), FAMILY fam_c2 (pk1, pk2, c2)
);
INSERT INTO t (pk1, pk2, c1, c2) VALUES (1:::DECIMAL, 0:::INT8, 0:::INT8);

When the INSERT statement is evaluated, only a KV entry for fam_c1 is
actually put into the KV layer (because the value part of fam_c2 - c2
column - is NULL).

Next, when evaluating a query SELECT c2 FROM t WHERE pk2 = 0;, we
first will scan the secondary unique index to fetch /1/0 primary
key. Then, we'll do an index join against the primary key to fetch c2.
Before this patch, we would perform a Get of /t/pk/1/0/1/1 (essentially
trying to read c2 directly of fam_c2 - /1/1 suffix is the varlen
encoding of family ID 1); however, there is no such entry, so we would
mistakenly think that the row is absent and return no rows.

The problem was that we incorrectly determined fam_c2 to be
non-nullable, so we assumed it to always have a KV entry if a row is
present. We made that determination based on the fact that pk1 column
(which is non-nullable, indexed, and composite) must force the existence
of that KV entry because we're using the primary index encoding.

However, I think that reasoning is just bogus. Any column family should
be considered non-nullable IFF it has a non-nullable non-indexed column,
so this patch removes all the business about non-nullable, indexed,
composite columns. The zeroth column family is an exception that it is
always non-nullable.

Note that this patch makes us fetch more column families, so it should
not be a correctness regression although it could be a performance
regression if my reasoning is wrong.

With this change in place we now correctly perform a scan of
/t/pk/1/{0-1} span.

Fixes: #76289.

Release note (bug fix): Previously, CockroachDB could incorrectly not
return a row from a table with multiple column families when that row
contains a NULL value when a composite type (FLOAT, DECIMAL, COLLATED
STRING, or an arrays of such a type) is included in the PRIMARY KEY. For
the bug to occur composite column from the PRIMARY KEY must be included
in any column family other than the first one.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@RaduBerinde
Copy link
Member

we rely on it to find NULL values in columns in other column families

Can you explain this a little more? (perhaps with an example)

@yuzefovich yuzefovich marked this pull request as ready for review February 15, 2022 06:14
@yuzefovich yuzefovich requested review from RaduBerinde, mgartner and a team February 15, 2022 06:14
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've thought a bit more about this, and I think I mostly got a grasp of this code, so I added more expansive comment. RFAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde and @yuzefovich)


-- commits, line 19 at r1:
This makes me wonder why we allow specifying PK columns in families - they always exist in all families.


-- commits, line 30 at r1:
I don't understand why the trailing /1/1 exists in this key.


pkg/sql/rowenc/index_encoding.go, line 298 at r1 (raw file):

		if needed {
			neededFamilyIDs = append(neededFamilyIDs, family.ID)
			allFamiliesNullable = allFamiliesNullable && nullable

is it worth adding multi-column-family tests in TestIndexKey in index_encoding_test.go for this logic?

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @mgartner and @yuzefovich)


-- commits, line 19 at r1:
Let's give these column families names e.g. fam_c1, fam_c2 and refer to them by name. Some of the text below uses "first column family" when I think it actually means "column family 1" which is the second one.


-- commits, line 30 at r1:

Previously, mgartner (Marcus Gartner) wrote…

I don't understand why the trailing /1/1 exists in this key.

I think it should be /t/pk/1/false/1 no? (or we could use a given fam_c1 name instead of the ID)


pkg/sql/rowenc/index_encoding.go, line 188 at r1 (raw file):

// retrieve neededCols for the specified table and index. The returned descpb.FamilyIDs
// are in sorted order.
func NeededColumnFamilyIDs(

If you have some time, it would be worthwhile to add a unit test for this. It has non-trivial logic and it's not great to only have high-level logic tests covering it.


pkg/sql/rowenc/index_encoding.go, line 290 at r1 (raw file):

			}
			if !columns[columnOrdinal].IsNullable() && !indexedCols.Contains(columnOrdinal) {
				// This column is non-nullable and is not indexed, thus, it must

👍

Previously, we could incorrectly calculate whether fetching a KV for
`FamilyID==0` is needed. The zeroth KV is always present, and we rely on
it to find NULL values in columns in other column families. Before this
patch we could "optimize" the behavior to not fetch the zeroth column
family with composite non-nullable columns when we need to lookup
nullable column families (families that only contain nullable columns);
as a result, the lookup could come back empty when those columns only
have NULLs.

Consider the following example:
```
CREATE TABLE t (
  pk1 DECIMAL NOT NULL, pk2 INT8 NOT NULL, c1 INT8, c2 INT8,
  PRIMARY KEY (pk1, pk2),
  UNIQUE (pk2),
  FAMILY fam_c1 (c1), FAMILY fam_c2 (pk1, pk2, c2)
);
INSERT INTO t (pk1, pk2, c1, c2) VALUES (1:::DECIMAL, 0:::INT8, 0:::INT8);
```
When the INSERT statement is evaluated, only a KV entry for `fam_c1` is
actually put into the KV layer (because the value part of `fam_c2`  - `c2`
column - is NULL).

Next, when evaluating a query `SELECT c2 FROM t WHERE pk2 = 0;`, we
first will scan the secondary unique index to fetch `/1/0` primary
key. Then, we'll do an index join against the primary key to fetch `c2`.
Before this patch, we would perform a Get of `/t/pk/1/0/1/1` (essentially
trying to read `c2` directly of `fam_c2` - `/1/1` suffix is the varlen
encoding of family ID 1); however, there is no such entry, so we would
mistakenly think that the row is absent and return no rows.

The problem was that we incorrectly determined `fam_c2` to be
non-nullable, so we assumed it to always have a KV entry if a row is
present. We made that determination based on the fact that `pk1` column
(which is non-nullable, indexed, and composite) must force the existence
of that KV entry because we're using the primary index encoding.

However, I think that reasoning is just bogus. Any column family should
be considered non-nullable IFF it has a non-nullable non-indexed column,
so this patch removes all the business about non-nullable, indexed,
composite columns. The zeroth column family is an exception that it is
always non-nullable.

Note that this patch makes us fetch more column families, so it should
not be a correctness regression although it could be a performance
regression if my reasoning is wrong.

With this change in place we now correctly perform a scan of
`/t/pk/1/{0-1}` span.

Release note (bug fix): Previously, CockroachDB could incorrectly not
return a row from a table with multiple column families when that row
contains a NULL value when a composite type (FLOAT, DECIMAL, COLLATED
STRING, or an arrays of such a type) is included in the PRIMARY KEY. For
the bug to occur composite column from the PRIMARY KEY must be included
in any column family other than the first one.
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @RaduBerinde)


-- commits, line 19 at r1:

Previously, mgartner (Marcus Gartner) wrote…

This makes me wonder why we allow specifying PK columns in families - they always exist in all families.

That's a good point, filed #76604.


-- commits, line 19 at r1:

Previously, RaduBerinde wrote…

Let's give these column families names e.g. fam_c1, fam_c2 and refer to them by name. Some of the text below uses "first column family" when I think it actually means "column family 1" which is the second one.

Good point, reworded.


-- commits, line 30 at r1:

Previously, RaduBerinde wrote…

I think it should be /t/pk/1/false/1 no? (or we could use a given fam_c1 name instead of the ID)

I was slightly wrong - it should have been /t/pk/1/false/1/1. /1/1 suffix is the varlen encoding of family ID 1, so the format is /table_name/index_name/pk1_value/pk2_value/family_id_varlen_encoded/.


pkg/sql/rowenc/index_encoding.go, line 188 at r1 (raw file):

Previously, RaduBerinde wrote…

If you have some time, it would be worthwhile to add a unit test for this. It has non-trivial logic and it's not great to only have high-level logic tests covering it.

I agree. It seems a bit involved to add those unit tests, so I filed #76623 and will add them during the stability period.


pkg/sql/rowenc/index_encoding.go, line 298 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

is it worth adding multi-column-family tests in TestIndexKey in index_encoding_test.go for this logic?

TestIndexKey is too high-level for this. I agree with you and Radu though that we should have some unit tests for this function.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner, @RaduBerinde, and @yuzefovich)


-- commits, line 30 at r1:

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I was slightly wrong - it should have been /t/pk/1/false/1/1. /1/1 suffix is the varlen encoding of family ID 1, so the format is /table_name/index_name/pk1_value/pk2_value/family_id_varlen_encoded/.

I'm confused.. this "pretty key" representation shows the values after decoding the various components, it shouldn't matter how it is encoded. You are right that they show up like that (https://gist.github.com/RaduBerinde/82a8088c20c3df449dd4a8d9d76aef89) but I don't understand why.

@craig
Copy link
Contributor

craig bot commented Feb 15, 2022

Build succeeded:

@craig craig bot merged commit ffd8b81 into cockroachdb:master Feb 15, 2022
@blathers-crl
Copy link

blathers-crl bot commented Feb 15, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from e4e8b51 to blathers/backport-release-21.1-76563: POST https://api.github.com/repos/cockroachlabs/cockroach/merges: 403 Resource not accessible by integration []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.1.x failed. See errors above.


error creating merge commit from e4e8b51 to blathers/backport-release-21.2-76563: POST https://api.github.com/repos/cockroachlabs/cockroach/merges: 403 Resource not accessible by integration []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@dt
Copy link
Member

dt commented Feb 16, 2022

blathers backport 21.2

@blathers-crl
Copy link

blathers-crl bot commented Feb 16, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from e4e8b51 to blathers/backport-release-21.2-76563: POST https://api.github.com/repos/cockroachlabs/cockroach/merges: 403 Resource not accessible by integration []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.2 failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: incorrect splitting lookup span into family spans
5 participants