-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
exec: overflow handling for vectorized arithmetic #38967
exec: overflow handling for vectorized arithmetic #38967
Conversation
SUM aggregator results are not too bad overall. Most affected is multiplication projection.
|
Similar to my suggestion from yesterday, I bet we can make multiplication much faster in the typical case by skipping the overflow check when the ints are sufficiently small. Maybe something clever like checking if |
6956af2
to
9653a99
Compare
I wonder if |
Thanks for the pointers, both the casting idea and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assuming that test failure is expected.
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @solongordon)
9653a99
to
a11b89f
Compare
I ended up changing multiplication to do something similar to Solon's idea, but I'm explicitly comparing to the upper and lower bounds so that things are a little more legible. There's a much smaller hit on latency now (although it's still substantial).
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
, nice. I have a suggestion to make the templates a little more legible. I'm also a little anxious that we don't have particularly great edge case testing of this still, even with the added tests from you and Matt. Is there a way we could write a quickcheck style random test that's specifically just testing edge behavior for this stuff? Or is it overkill?
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @rafiss and @solongordon)
pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 363 at r2 (raw file):
panic(tree.ErrIntOutOfRange) } %[1]s = result
Did you find these numbers confusing while writing this code? As a suggestion, it might be easier to read if you used a text template:
m := map[string]interface{}{"Target": target, "L": l, "R":, r}
buf := strings.Builder{}
t := template.Must(template.New("").Parse(`
{
result := {{.L}} + {{.R}}
...`)
t.Execute(&buf, m)
return buf.Build()
(disclaimer: not tested)
pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 389 at r2 (raw file):
case 8: upperBound = "10" lowerBound = "-10"
Do we need the int8 case? Oh right, problem is that the type exists in our package, like you were saying. We should probably just delete that type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're definitely right this isn't super well tested. That failing test in the previous revision of the PR was due to a bug/typo I had in the subtraction template, and we got lucky that there was a test that caught it in an unrelated logic test. I was trying to see if I could unit test this overflow handling since there are a lot of edge cases to check, but maybe I should just go ahead and write logic tests for all of the cases.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis and @solongordon)
pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 363 at r2 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Did you find these numbers confusing while writing this code? As a suggestion, it might be easier to read if you used a text template:
m := map[string]interface{}{"Target": target, "L": l, "R":, r} buf := strings.Builder{} t := template.Must(template.New("").Parse(` { result := {{.L}} + {{.R}} ...`) t.Execute(&buf, m) return buf.Build()
(disclaimer: not tested)
i'll look into this idea. I did find the %[1]s syntax annoying; i wish there were named format strings like in python
pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 389 at r2 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Do we need the int8 case? Oh right, problem is that the type exists in our package, like you were saying. We should probably just delete that type.
yeah, would make sense to remove int8 later
Writing unit tests seems like the way to go to me. Logic tests are very "bulky" in the sense that they take a lot of lines to do relatively little work. I think a unit test could probably manage to test quite a bit of these edge cases without as much typing. |
Release note: None
a11b89f
to
d7cff34
Compare
I updated the PR so that the overloads use templates with named arguments, and with a more robust unit test suite for all the overflow checks. Please look again :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The third is the charm. Excellent work!
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @rafiss and @solongordon)
pkg/sql/exec/overloads_test.go, line 122 at r4 (raw file):
} func assertIntegerEquals(t *testing.T, expected, actual int) {
You can use the testify
package for these helpers if you want. assert.Equal
and assert.PanicsWithValue
.
pkg/sql/exec/execgen/cmd/execgen/overloads.go, line 414 at r4 (raw file):
{ result := {{.Left}} * {{.Right}} if {{.Left}} > {{.UpperBound}} || {{.Left}} < {{.LowerBound}} || {{.Right}} > {{.UpperBound}} || {{.Right}} < {{.LowerBound}} {
This is readable! Yay!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis, @rafiss, and @solongordon)
pkg/sql/exec/overloads_test.go, line 122 at r4 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
You can use the
testify
package for these helpers if you want.assert.Equal
andassert.PanicsWithValue
.
ah nice, switched to the library
The overflow checks are done as part of the code generation in overloads.go. The checks are done inline, rather than calling the functions in the arith package for performance reasons. The checks are only done for integer math. float math is already well-defined since overflow will result in +Inf and -Inf as necessary. The operations that these checks are relevant for are the SUM_INT aggregator and projection. In the future, AVG will also benefit from these overflow checks. This changes the error message produced by overflows in the non-vectorized SUM_INT aggregator so that the messages are consistent. This should be fine in terms of postgres-compatibility since SUM_INT is unique to CRDB and eventually we will get rid of it anyway. resolves cockroachdb#38775 Release note: None
d7cff34
to
1dc97ee
Compare
thanks for helping me iterate! bors r+ |
38967: exec: overflow handling for vectorized arithmetic r=rafiss a=rafiss The overflow checks are done as part of the code generation in overloads.go. The checks are done inline, rather than calling the functions in the arith package for performance reasons. The checks are only done for integer math. float math is already well-defined since overflow will result in +Inf and -Inf as necessary. The operations that these checks are relevant for are the SUM_INT aggregator and projection. In the future, AVG will also benefit from these overflow checks. This changes the error message produced by overflows in the non-vectorized SUM_INT aggregator so that the messages are consistent. This should be fine in terms of postgres-compatibility since SUM_INT is unique to CRDB and eventually we will get rid of it anyway. resolves #38775 Release note: None Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Build succeeded |
The overflow checks are done as part of the code generation in
overloads.go. The checks are done inline, rather than calling the
functions in the arith package for performance reasons.
The checks are only done for integer math. float math is already
well-defined since overflow will result in +Inf and -Inf as necessary.
The operations that these checks are relevant for are the SUM_INT
aggregator and projection. In the future, AVG will also benefit from
these overflow checks.
This changes the error message produced by overflows in the
non-vectorized SUM_INT aggregator so that the messages are consistent.
This should be fine in terms of postgres-compatibility since SUM_INT is
unique to CRDB and eventually we will get rid of it anyway.
resolves #38775
Release note: None