rowenc: fix splitting lookup rows into family spans in some cases
Previously, we could incorrectly calculate whether fetching a KV for `FamilyID==0` is needed. The zeroth KV is always present, and we rely on it to find NULL values in columns in other column families. Before this patch, in the presence of composite non-nullable columns, we could "optimize" the lookup to skip the zeroth column family even when we needed to look up nullable column families (families that only contain nullable columns); as a result, the lookup could come back empty when those columns only have NULLs.

Consider the following example:

```
CREATE TABLE t (
  pk1 DECIMAL NOT NULL,
  pk2 BOOL NOT NULL,
  c1 INT8,
  c2 INT8,
  PRIMARY KEY (pk1, pk2),
  UNIQUE (pk2),
  FAMILY (c1),
  FAMILY (pk1, pk2, c2)
);
INSERT INTO t (pk1, pk2, c1, c2) VALUES (1:::DECIMAL, false, 0:::INT8, NULL);
```

When the INSERT statement is evaluated, only a KV entry for the zeroth column family is actually put into the KV layer (because the value part of the first column family, the `c2` column, is NULL).

Next, when evaluating the query `SELECT c2 FROM t WHERE (NOT pk2);`, we first scan the secondary unique index to fetch the `1/false` primary key. Then, we do an index join against the primary key to fetch `c2`. Before this patch, we would perform a Get of `/t/pk/1/false/0/1/1` (essentially trying to read `c2` directly from the first column family); however, there is no such entry, so we would mistakenly think that the row is absent and return no rows.

The problem was that we incorrectly determined the first column family to be non-nullable, so we assumed it would always have a KV entry if the row is present. We made that determination on the assumption that the `pk1` column (which is non-nullable, indexed, and composite) must force the existence of that KV entry because we're using the primary index encoding. However, I think that reasoning is just bogus.
Any column family other than the zeroth should be considered non-nullable if and only if it has a non-nullable non-indexed column, so this patch removes all the business about non-nullable, indexed, composite columns.

Note that this patch makes us fetch more column families, so it should not be a correctness regression, although it could be a performance regression if my reasoning is wrong.

Release note (bug fix): Previously, CockroachDB could fail to return a row from a table with multiple column families when that row contained a NULL value and a composite type (FLOAT, DECIMAL, COLLATED STRING, or an array of such a type) was included in the PRIMARY KEY.
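
The family-nullability rule described above can be sketched in Go roughly as follows. This is a simplified model, not the actual `rowenc` code: the `Column` and `Family` types and the functions `familyAlwaysHasKV` and `familiesToFetch` are hypothetical names introduced for illustration, and the sketch assumes that the zeroth family is added as a sentinel whenever none of the needed families is guaranteed to have a KV entry.

```go
package main

import "fmt"

// Column is a hypothetical, simplified stand-in for a column descriptor.
type Column struct {
	Name     string
	Nullable bool
	Indexed  bool // true if the column is part of the index key being read
}

// Family is a hypothetical, simplified column family.
type Family struct {
	ID      int
	Columns []Column
}

// familyAlwaysHasKV reports whether the presence of a row guarantees a KV
// entry for the family. Family 0 always has one; any other family has one
// only if it contains a non-nullable, non-indexed column.
func familyAlwaysHasKV(f Family) bool {
	if f.ID == 0 {
		return true
	}
	for _, c := range f.Columns {
		if !c.Nullable && !c.Indexed {
			return true
		}
	}
	return false
}

// familiesToFetch returns the family IDs to read for the needed families.
// Assumption of this sketch: if no needed family is guaranteed to have a
// KV entry, family 0 is also fetched so that an all-NULL family is not
// mistaken for a missing row.
func familiesToFetch(needed []Family) []int {
	ids := []int{}
	guaranteed := false
	for _, f := range needed {
		ids = append(ids, f.ID)
		if familyAlwaysHasKV(f) {
			guaranteed = true
		}
	}
	if !guaranteed {
		ids = append([]int{0}, ids...)
	}
	return ids
}

func main() {
	// Family 1 from the example table: FAMILY (pk1, pk2, c2).
	fam1 := Family{ID: 1, Columns: []Column{
		{Name: "pk1", Nullable: false, Indexed: true},
		{Name: "pk2", Nullable: false, Indexed: true},
		{Name: "c2", Nullable: true, Indexed: false},
	}}
	fmt.Println(familyAlwaysHasKV(fam1))         // false
	fmt.Println(familiesToFetch([]Family{fam1})) // [0 1]
}
```

In this model, the `SELECT c2` lookup from the example reads both family 0 and family 1, so the row is found even though family 1 has no KV entry.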