-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
exec: add UnsetNulls to ResetInternalBatch #40593
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto and @jordanlewis)
pkg/sql/exec/utils_test.go, line 151 at r3 (raw file):
} if secondBatchHasNulls { assert.True(t, hasNulls(b))
more of a question for my own benefit: do you know when a test should use assert
versus require
?
the docs say that assert
will return a pass/fail bool, while require
will abort the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rafiss)
pkg/sql/exec/utils_test.go, line 151 at r3 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
more of a question for my own benefit: do you know when a test should use
assert
versusrequire
?the docs say that
assert
will return a pass/fail bool, whilerequire
will abort the test.
Good question, I'm not completely sure. I guess that it's the same difference as aborting and simply logging an error and failing the test after it completes. I always tend to use require
but kept assert
here since it was already used. It also sort of makes sense in this case to get more information for a given failure (i.e. is it only nulls or selection as well that is failing?). It's confusing that it's named assert though, I would expect something named assert to abort a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catches! Just curious: did you write a test first or did you spot the bugs first? :)
Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 2 of 2 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto, @jordanlewis, and @rafiss)
pkg/col/coldata/batch.go, line 176 at r3 (raw file):
m.SetSelection(false) for _, v := range m.b { v.Nulls().UnsetNulls()
I needed to do this thing for merge joiner. Could you please update mergejoiner_tmpl.go
to remove the unsetting nulls loop in there since it'll now be redundant?
pkg/sql/exec/aggregator.go, line 269 at r1 (raw file):
func (a *orderedAggregator) Next(ctx context.Context) coldata.Batch { if a.scratch.shouldResetInternalBatch { a.scratch.ResetInternalBatch()
[nit]: I'd also explicitly set a.scratch.shouldResetInternalBatch
to false
.
pkg/sql/exec/utils_test.go, line 69 at r3 (raw file):
// hasNulls is a helper function that returns whether any of the columns in b // (maybe) have nulls. func hasNulls(b coldata.Batch) bool {
[nit]: why not call it maybeHasNulls
then?
Otherwise, the overflow results that were not returned yet might be modified by downstream operators, resulting in their corruption. Release note: None
The hashJoiner was calling Next recursively, resulting in resetting hj.prober.batch twice, which could lead to corruption (by unsetting selection vectors or nulls). Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrote the test first to make sure stuff failed then added UnsetNulls
to see the difference. The hashJoiner
issue was caught due to the test but the aggregator
was another test that was failing due to unexpected results after the addition of UnsetNulls
.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @rafiss, and @yuzefovich)
pkg/col/coldata/batch.go, line 176 at r3 (raw file):
Previously, yuzefovich wrote…
I needed to do this thing for merge joiner. Could you please update
mergejoiner_tmpl.go
to remove the unsetting nulls loop in there since it'll now be redundant?
Done.
pkg/sql/exec/aggregator.go, line 269 at r1 (raw file):
Previously, yuzefovich wrote…
[nit]: I'd also explicitly set
a.scratch.shouldResetInternalBatch
tofalse
.
Done.
pkg/sql/exec/utils_test.go, line 69 at r3 (raw file):
Previously, yuzefovich wrote…
[nit]: why not call it
maybeHasNulls
then?
Done.
bors r=yuzefovich,rafiss |
Build failed |
We were forgetting to UnsetNulls when resetting internal batches, possibly resulting in null results when there should be none. UnsetNulls is added to ResetInternalBatch in this commit and verifySelResets has been updated to also check for correct Null handling, as well as increasing the test coverage to also check for cases in which selection vectors or nulls should be present in the output. Release note: None
bors r=yuzefovich,rafiss |
40593: exec: add UnsetNulls to ResetInternalBatch r=yuzefovich,rafiss a=asubiotto First two commits are fixes related to unsetting nulls in ResetInternalBatch in the aggregator and hashjoiner while the last commit is the actual switch flip as well as the testing addition. Co-authored-by: Alfonso Subiotto Marqués <alfonso@cockroachlabs.com>
Build succeeded |
First two commits are fixes related to unsetting nulls in ResetInternalBatch in the aggregator and hashjoiner while the last commit is the actual switch flip as well as the testing addition.