-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
exec: make sure that AND operator pays attention to whether a value is null #40615
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @rafiss, and @yuzefovich)
pkg/sql/logictest/testdata/logic_test/vectorize, line 851 at r1 (raw file):
# Test that AND'ing a true value with another true value while one of them is # actually NULL returns false.
Wait, but true and null
is supposed to return null
... the rules are confusing. It appears that if either side is false
, the result is false
, but if either side is true
the result is null.
It makes sense if you think of null
as being "unknown". if there's a false in the and
, the result is definitely false even though we don't know what the unknown value is. But if there's a true, we can't tell what the outcome will be due to the unknown value so we're forced to return null
.
I would have guessed that there is some existing logic test that would catch this. I thought we were planning to enable all the existing logic tests that support vectorized to actually run with the vectorized engine? Or did we already do that, but this case isn't tested? Either way it seems like a really good thing to do before release. |
I don't see tests for that in the logic tests. There are tests for it in the eval tests, which we don't run through the vectorized engine because it's kind of tricky to do so - those just test expressions, and the expressions will get constant folded. We should try to figure out how to run those tests via the vectorized engine somehow for sure. They're in |
7953d9c
to
17309dd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed the correctness issue and added unit tests. PTAL.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rafiss)
pkg/sql/logictest/testdata/logic_test/vectorize, line 851 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Wait, but
true and null
is supposed to returnnull
... the rules are confusing. It appears that if either side isfalse
, the result isfalse
, but if either side istrue
the result is null.It makes sense if you think of
null
as being "unknown". if there's a false in theand
, the result is definitely false even though we don't know what the unknown value is. But if there's a true, we can't tell what the outcome will be due to the unknown value so we're forced to returnnull
.
Thanks for pointing this out! Your explanation makes sense to me, my intuition was wrong. Fixed.
17309dd
to
6233dd6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @yuzefovich)
pkg/sql/exec/and_tmpl.go, line 79 at r3 (raw file):
// Rule 1: at least one boolean is FALSE. outputColVals[i] = false outputNulls.UnsetNull(i)
Isn't it the case that in principle we should not need to UnsetNull
here? The contract is (or should be) that the column that a projection owns will be reset on each iteration by the batch owner. This is taken care of by ResetInternalBatch
, which unsets all nulls. I would think and hope that only the last SetNull
is needed.
pkg/sql/exec/and_tmpl.go, line 92 at r3 (raw file):
_ = rightColVals[n-1] _ = outputColVals[n-1] for i := range leftColVals[:n] {
This whole part is identical, right? Can you make it a template function? It's too much code duplication (potential for mistakes due to editing one and not the other) right now.
…s null Previously, AND operator would simply logically and two boolean vectors. However, this is incorrect if we have actual null values in those vectors. The underlying complication is that nulls are stored separately, so we need to explicitly check for them. Now this is fixed. Release note: None
This commit adds a unit test for AND operator. It also extends our testing infrastructure to be able to specify the types schema (this was needed because nil values are treated as Int64 by default which is incorrect for tests of AND operator). Release note: None
6233dd6
to
37b76c1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rafiss)
pkg/sql/exec/and_tmpl.go, line 79 at r3 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Isn't it the case that in principle we should not need to
UnsetNull
here? The contract is (or should be) that the column that a projection owns will be reset on each iteration by the batch owner. This is taken care of byResetInternalBatch
, which unsets all nulls. I would think and hope that only the lastSetNull
is needed.
You're right, only the last one is required. I thought of being on the safe side given that we've had multiple issues with nulls, but I guess it might hide a problem somewhere else, so I removed the other two.
pkg/sql/exec/and_tmpl.go, line 92 at r3 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
This whole part is identical, right? Can you make it a template function? It's too much code duplication (potential for mistakes due to editing one and not the other) right now.
Good point, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 6 files at r1, 1 of 2 files at r2, 8 of 8 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rafiss)
TFTRs! bors r+ |
40615: exec: make sure that AND operator pays attention to whether a value is null r=yuzefovich a=yuzefovich **exec: make sure that AND operator pays attention to whether a value is null** Previously, AND operator would simply logically and two boolean vectors. However, this is incorrect if we have actual null values in those vectors. The underlying complication is that nulls are stored separately, so we need to explicitly check for them. Now this is fixed. **exec: add unit test for AND operator** This commit adds a unit test for AND operator. It also extends our testing infrastructure to be able to specify the types schema (this was needed because nil values are treated as Int64 by default which is incorrect for tests of AND operator). Release note: None Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Build succeeded |
exec: make sure that AND operator pays attention to whether a value is null
Previously, AND operator would simply logically and two boolean vectors.
However, this is incorrect if we have actual null values in those
vectors. The underlying complication is that nulls are stored separately,
so we need to explicitly check for them. Now this is fixed.
exec: add unit test for AND operator
This commit adds a unit test for AND operator. It also extends our
testing infrastructure to be able to specify the types schema (this
was needed because nil values are treated as Int64 by default which
is incorrect for tests of AND operator).
Release note: None