Skip to content

Commit

Permalink
rowenc: fix splitting lookup rows into family spans in some cases
Browse files Browse the repository at this point in the history
Previously, we could incorrectly calculate whether fetching a KV for
`FamilyID==0` is needed. The zeroth KV is always present, and we rely on
it to find NULL values in columns in other column families. Before this
patch we could "optimize" the behavior to not fetch the zeroth column
family with composite non-nullable columns when we need to lookup
nullable column families (families that only contain nullable columns);
as a result, the lookup could come back empty when those columns only
have NULLs.

Consider the following example:
```
CREATE TABLE t (
  pk1 DECIMAL NOT NULL, pk2 BOOL NOT NULL, c1 INT8, c2 INT8,
  PRIMARY KEY (pk1, pk2),
  UNIQUE (pk2),
  FAMILY (c1), FAMILY (pk1, pk2, c2)
);
INSERT INTO t (pk1, pk2, c1, c2) VALUES (1:::DECIMAL, false, 0:::INT8, NULL);
```
When the INSERT statement is evaluated, only a KV entry for the zeroth
column family is actually put into the KV layer (because the value part
of the first column family - `c2` column - is NULL).

Next, when evaluating a query `SELECT c2 FROM t WHERE (NOT pk2);`, we
first will scan the secondary unique index to fetch `1/false` primary
key. Then, we'll do an index join against the primary key to fetch `c2`.
Before this patch, we would perform a Get of `/t/pk/1/false/0/1/1`
(essentially trying to read `c2` directly of the first column family);
however, there is no such entry, so we would mistakenly think that the
row is absent and return no rows.

The problem was that we incorrectly determined the first column family
to be non-nullable, so we assumed it to always have a KV entry if a row
is present. We made that determination based on the fact that `pk1`
column (which is non-nullable, indexed, and composite) must force the
existence of that KV entry because we're using the primary index
encoding.

However, I think that reasoning is just bogus. Any column family other
than the zeroth should be considered non-nullable IFF it has
a non-nullable non-indexed column, so this patch removes all the
business about non-nullable, indexed, composite columns.

Note that this patch makes us fetch more column families, so it should
not be a correctness regression although it could be a performance
regression if my reasoning is wrong.

Release note (bug fix): Previously, CockroachDB could incorrectly not
return a row from a table with multiple column families when that row
contains a NULL value when a composite type (FLOAT, DECIMAL, COLLATED
STRING, or an arrays of such a type) is included in the PRIMARY KEY.
  • Loading branch information
yuzefovich committed Feb 15, 2022
1 parent 3be8121 commit 222e6a9
Show file tree
Hide file tree
Showing 2 changed files with 35 additions and 7 deletions.
30 changes: 30 additions & 0 deletions pkg/sql/logictest/testdata/logic_test/select_index
Original file line number Diff line number Diff line change
Expand Up @@ -673,3 +673,33 @@ SELECT k FROM t47976 WHERE
(a > 4 OR b <= 5.23 OR c IN (1, 2, 3)) AND
(a = 12 OR b = 15.23 OR c = 14) AND
(a > 58 OR b < 0 OR c >= 13)

# Regression test for incorrectly splitting a lookup row (with NULL values not
# in the lookup key) into family spans (#76289).
statement ok
CREATE TABLE t76289_1 (
pk1 DECIMAL NOT NULL, pk2 BOOL NOT NULL, c1 INT8, c2 INT8,
PRIMARY KEY (pk1, pk2),
UNIQUE (pk2),
FAMILY (c1), FAMILY (pk1, pk2, c2)
);
INSERT INTO t76289_1 (pk1, pk2, c1) VALUES (1:::DECIMAL, false, 0:::INT8);

query I
SELECT c2 FROM t76289_1 WHERE (NOT pk2);
----
NULL

statement ok
CREATE TABLE t76289_2 (
pk1 DECIMAL NOT NULL, pk2 BOOL NOT NULL, c1 INT8, c2 INT8,
PRIMARY KEY (pk1, pk2),
INDEX (pk2),
FAMILY (c1), FAMILY (pk1, pk2, c2)
);
INSERT INTO t76289_2 (pk1, pk2, c1) VALUES (1:::DECIMAL, false, 0:::INT8);

query I
SELECT c2 FROM t76289_2@t76289_2_pk2_idx WHERE (NOT pk2);
----
NULL
12 changes: 5 additions & 7 deletions pkg/sql/rowenc/index_encoding.go
Original file line number Diff line number Diff line change
Expand Up @@ -286,18 +286,16 @@ func NeededColumnFamilyIDs(
if nc.Contains(columnOrdinal) {
needed = true
}
if !columns[columnOrdinal].IsNullable() && (!indexedCols.Contains(columnOrdinal) ||
compositeCols.Contains(columnOrdinal) && !hasSecondaryEncoding) {
// The column is non-nullable and cannot be decoded from a different
// family, so this column family must have a KV entry for every row.
if !columns[columnOrdinal].IsNullable() && !indexedCols.Contains(columnOrdinal) {
// This column is non-nullable and is not indexed, thus, it must
// be stored in the value part of the KV entry. As a result,
// this column family is non-nullable too.
nullable = false
}
}
if needed {
neededFamilyIDs = append(neededFamilyIDs, family.ID)
if !nullable {
allFamiliesNullable = false
}
allFamiliesNullable = allFamiliesNullable && nullable
}
return nil
})
Expand Down

0 comments on commit 222e6a9

Please sign in to comment.