sql: add width_bucket builtin #39263

kevinbarbour · 2019-08-02T14:20:19Z

Implements the width_bucket() builtin function

Details on the Postgres implementation can be found here:
https://www.postgresql.org/docs/11/functions-math.html

Resolves #38855

Release note (sql change): add the width_bucket builtin function.
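
For readers unfamiliar with the function, here is a minimal sketch of the equal-width bucketing that the linked Postgres docs describe. It is not the code in this PR: it borrows only the `widthBucket` helper signature quoted in the review below, and it assumes `b1 < b2` and `count > 0` (the reversed-bounds case is left out).

```go
package main

import (
	"fmt"
	"math"
)

// widthBucket returns the 1-based index of the equal-width bucket that operand
// falls into when the range [b1, b2) is split into count buckets. It returns 0
// for operands below b1 and count+1 for operands at or above b2.
// Assumption (not from the PR): b1 < b2 and count > 0.
func widthBucket(operand, b1, b2 float64, count int) int {
	if operand < b1 {
		return 0
	}
	if operand >= b2 {
		return count + 1
	}
	width := (b2 - b1) / float64(count)
	bucket := int(math.Floor((operand-b1)/width)) + 1
	// Guard against floating-point rounding pushing an in-range operand
	// past the last bucket.
	if bucket > count {
		bucket = count
	}
	return bucket
}

func main() {
	fmt.Println(widthBucket(5, 1, 10, 5)) // prints 3
}
```

The result for `(5, 1, 10, 5)` agrees with the `select width_bucket(5, 1, 10, 5)` output shown later in this thread.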

cockroach-teamcity · 2019-08-02T14:20:26Z

This change is

CLAassistant · 2019-08-02T14:20:26Z

All committers have signed the CLA.

jordanlewis

Nice - thanks @barbourkd for the contribution.

As far as the multiple overloads thing goes, I do think you should add one for floats as well. You should be able to see that it doesn't work if you do select width_bucket(a,b,c,d) from t where a, b, c are floats and not decimals.

Could you add some logic tests that exercise the SQL builtins themselves as well? You can add them to pkg/sql/logictest/testdata/logic_test/builtin_function (follow the patterns to understand the test format) and run that test file with make testbaselogic FILES=builtin_function.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @barbourkd)

pkg/sql/sem/builtins/builtins.go, line 2374 at r1 (raw file):

				for i, v := range thresholds.Array {
					if operand.Compare(ctx, v) < 0 {

This seems sketchy - does it actually work if the operand has a different type from the element type of thresholds?

Either way, definitely try to test this in the logic test file I mentioned above.

pkg/sql/sem/builtins/builtins.go, line 4693 at r1 (raw file):

}

func widthBucket(operand float64, b1 float64, b2 float64, count int) int {

Please put a docstring comment describing what this does. It should say something like widthBucket bars the foo.

kevinbarbour · 2019-08-02T15:06:33Z

Thanks for the notes on this @jordanlewis

I do think you should add one for floats as well. You should be able to see that it doesn't work if you do select width_bucket(a,b,c,d) from t where a, b, c are floats and not decimals.

I'll test that out and create another overload to handle floats.

Please put a docstring comment describing what this does. It should say something like widthBucket bars the foo.
Could you add some logic tests that exercise the SQL builtins themselves as well?

👍

This seems sketchy - does it actually work if the operand has a different type from the element type of thresholds?

In this case the datum Compare func does throw an error. Will add a check and error condition to ensure the types are consistent.
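
As an illustration of the thresholds variant under discussion, here is a rough sketch that mirrors the shape of the loop quoted in the review. It is not the PR's implementation: Go generics stand in for the runtime type check, so a mismatched operand and threshold element type is rejected at compile time rather than with an error. The names `ordered` and `bucketForThresholds` are hypothetical, and the return values follow my reading of the Postgres docs (thresholds sorted ascending).

```go
package main

import "fmt"

// ordered is a hypothetical constraint covering a few comparable types.
type ordered interface {
	~int | ~int64 | ~float64 | ~string
}

// bucketForThresholds returns the index of the first threshold that is
// strictly greater than operand, or len(thresholds) if there is none.
// thresholds is assumed to be sorted in ascending order.
func bucketForThresholds[T ordered](operand T, thresholds []T) int {
	for i, v := range thresholds {
		if operand < v {
			return i
		}
	}
	return len(thresholds)
}

func main() {
	fmt.Println(bucketForThresholds(5.5, []float64{1, 3, 5, 10})) // prints 3
	// bucketForThresholds(5.5, []string{"a"}) would fail to compile,
	// because operand and thresholds must share one element type.
}
```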

kevinbarbour · 2019-08-02T16:03:22Z

Adding a float overload causes some new issues. Before adding the float overload this worked, as it just treated the integers as decimals:

root@127.0.0.1:65297/startrek> select width_bucket(5, 1, 10, 5);
  width_bucket
+--------------+
             3
(1 row)

After adding a new overload for floats it no longer infers the ints as decimals:

root@127.0.0.1:65313/startrek> select width_bucket(5, 1, 10, 5);
pq: unknown signature: width_bucket(int, int, int, int)

That could be resolved by adding an Integer overload but then we are unable to do this:

root@127.0.0.1:65313/startrek> select width_bucket(5, 1.1, 10.0, 5);
pq: unknown signature: width_bucket(int, decimal, decimal, int)

Any thoughts on the best way to handle this?

kevinbarbour

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)

pkg/sql/sem/builtins/builtins.go, line 2374 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

This seems sketchy - does it actually work if the operand has a different type from the element type of thresholds?

Either way, definitely try to test this in the logic test file I mentioned above.

Done - added error condition and a check in the logic tests.

pkg/sql/sem/builtins/builtins.go, line 4693 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Please put a docstring comment describing what this does. It should say something like widthBucket bars the foo.

Done.

kevinbarbour

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)

pkg/sql/sem/builtins/builtins.go, line 2351 at r3 (raw file):

	),

	"width_bucket": makeBuiltin(defProps(),

@jordanlewis Put in overloads for int, float, and decimal. Pretty stumped on why this doesn't allow a mix of integer and decimal literals based on how other built-ins behave. Not sure if that's a use case that is worth trying to support or not.

Using the div built-in as an example, the following div overloads exist:

div(int, int)
div(float, float)
div(decimal, decimal)

when you run div(5, 4.1) it works - it seems to just act as if the 5 is a decimal value and call the decimal overload. There is no div(int, decimal) overload.

Yet with the following width_bucket overloads:

width_bucket(int, int, int, int)
width_bucket(float, float, float, int)
width_bucket(decimal, decimal, decimal, int)

when you run width_bucket(1, 1.0, 10.0, 5) it throws this:
unknown signature: width_bucket(int, decimal, decimal, int)

jordanlewis

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @barbourkd and @jordanlewis)

pkg/sql/sem/builtins/builtins.go, line 2351 at r3 (raw file):

Previously, barbourkd (Kevin) wrote…

@jordanlewis Put in overloads for int, float, and decimal. Pretty stumped on why this doesn't allow a mix of integer and decimal literals based on how other built-ins behave. Not sure if that's a use case that is worth trying to support or not.

Using the div built-in as an example, the following div overloads exist:
div(int, int)
div(float, float)
div(decimal, decimal)
when you run div(5, 4.1) it works - it seems to just act as if the 5 is a decimal value and call the decimal overload. There is no div(int, decimal) overload.

Yet with the following width_bucket overloads:
width_bucket(int, int, int, int)
width_bucket(float, float, float, int)
width_bucket(decimal, decimal, decimal, int)
when you run width_bucket(1, 1.0, 10.0, 5) it throws this:
unknown signature: width_bucket(int, decimal, decimal, int)

Hm, agreed that this is fairly suspicious. I would think that you'd only need one for float and one for decimal. I'll give this a closer look in a bit.

kevinbarbour · 2019-08-07T13:26:17Z

@jordanlewis I'm planning to spend a little more time digging into this today. Have you had a chance to take a look at it, or do you have any thoughts on where I should focus to figure out what's going on?

jordanlewis · 2019-08-07T13:34:35Z

I unfortunately haven't had any time to try to dig into this. My guess is that there's a deeper issue with overload resolution with mixed type arguments. I recall there's some logic that acts differently depending on whether the overload has fully homogeneous types, which isn't the case here.

If you want you could try to look into that. I'll warn you that that code is fairly tricky and you'd probably need to add a bunch of prints or use a debugger to understand what's going on in this case.

In the end, it might not be necessary to have a float overload at all - it's a nice to have - but it's a bummer the overload stuff is getting in our way here.

kevinbarbour · 2019-08-08T16:29:07Z

You weren't lying about this overload code being tricky - took me a little bit, but eventually tracked down the path it's taking through the overload resolution logic. Your guess was correct - after a certain point the overload resolution code just tries to make all the parameters homogeneous and in this case they are not. I assume that we're a little outside the scope of the issue at this point, but here's what I figured out:

Say that we make it into the overload heuristics with a signature of width_bucket(decimal, int, int, int) and possible overloads of width_bucket(decimal, decimal, decimal, int) and width_bucket(float, float, float, int). It then runs through the first 3 heuristics trying to eliminate all but one signature and fails - since they are both equally valid. We get to the fourth heuristic and still have both signatures as possibilities.

// The fourth heuristic is to prefer candidates that accepts the "best"
// mutual type in the resolvable type set of all constants.

Basically, at this point it determines the "best mutual type" across all of the parameters and then eliminates any overload that won't accept homogeneous parameters of that type. In our case the "best mutual type" of (decimal, int, int, int) is decimal. It then eliminates both of the width_bucket signatures because neither of them can accept (decimal, decimal, decimal, decimal) parameters - leaving us with no possible overloads.
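
The elimination step described above can be modeled with a small toy program. This is a simplified sketch, not CockroachDB's overload resolution code: the `overload` type and `keepHomogeneous` function are hypothetical, types are plain strings, and the "best mutual type" is passed in rather than computed.

```go
package main

import "fmt"

// overload is a simplified stand-in for a builtin signature:
// just the list of parameter type names.
type overload []string

// keepHomogeneous keeps only the candidates whose parameters are all of the
// given best type, mimicking the elimination described for the fourth
// heuristic in this toy model.
func keepHomogeneous(candidates []overload, best string) []overload {
	var kept []overload
	for _, c := range candidates {
		homogeneous := true
		for _, p := range c {
			if p != best {
				homogeneous = false
				break
			}
		}
		if homogeneous {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	div := []overload{{"int", "int"}, {"float", "float"}, {"decimal", "decimal"}}
	widthBucket := []overload{
		{"decimal", "decimal", "decimal", "int"},
		{"float", "float", "float", "int"},
	}
	fmt.Println(len(keepHomogeneous(div, "decimal")))         // prints 1
	fmt.Println(len(keepHomogeneous(widthBucket, "decimal"))) // prints 0
}
```

In this model div keeps its (decimal, decimal) overload, while both width_bucket candidates are dropped because of the trailing int parameter, which is consistent with the "unknown signature" error reported earlier in the thread.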

Gurio · 2019-08-22T15:36:08Z

@jordanlewis could you approve/merge?

jordanlewis · 2019-08-22T15:45:41Z

Sorry about the delay here. Let's merge the version without the float support. We can do better here later.

kevinbarbour · 2019-08-29T11:17:25Z

Sounds good, I’ll pull the float overload out and get the PR updated later today.

kevinbarbour · 2019-08-29T14:55:59Z

@jordanlewis Float overload has been removed.

jordanlewis · 2019-08-29T15:20:13Z

Thanks - looks like you have a merge conflict. If you could fix that that would be great. Also, I know you probably have gotten busy with other things - if you're out of bandwidth, feel free to check the "maintainers can push" box on this PR and I can fix up the remaining small issues.

kevinbarbour · 2019-08-29T15:26:37Z

Conflict resolved.

kevinbarbour · 2019-08-29T16:26:52Z

@jordanlewis Not sure what's going on with the failed build. Build is successful on my local machine.

jordanlewis · 2019-08-30T00:31:55Z

Okay, thanks @barbourkd!

bors r+

39263: sql: add width_bucket builtin r=jordanlewis a=barbourkd

Implements the width_bucket() builtin function

Details on the Postgres implementation can be found here:
https://www.postgresql.org/docs/11/functions-math.html

Resolves #38855

Release note (sql change): add the width_bucket builtin function.

Co-authored-by: Kevin Barbour <barbourkd@vcu.edu>
Co-authored-by: Kevin <kevinbarbourd@gmail.com>

craig · 2019-08-30T01:01:24Z

Build succeeded

GitHub CI (Cockroach)

kevinbarbour requested a review from a team August 2, 2019 14:20

kevinbarbour mentioned this pull request Aug 2, 2019

builtins: add width_bucket builtin function #38855

Closed

kevinbarbour force-pushed the width_bucket branch from 5256420 to 18375f8 Compare August 2, 2019 14:27

jordanlewis reviewed Aug 2, 2019

kevinbarbour force-pushed the width_bucket branch from 18375f8 to b62320a Compare August 2, 2019 16:42

kevinbarbour commented Aug 2, 2019

kevinbarbour force-pushed the width_bucket branch from b62320a to d64c8e1 Compare August 2, 2019 18:56

kevinbarbour commented Aug 2, 2019

jordanlewis reviewed Aug 2, 2019

sql: add width_bucket builtin

48599ad

kevinbarbour force-pushed the width_bucket branch from d64c8e1 to 48599ad Compare August 29, 2019 14:54

Merge branch 'master' into width_bucket

3ff89d5

craig bot merged commit 3ff89d5 into cockroachdb:master Aug 30, 2019

jseldess mentioned this pull request Sep 5, 2019

sql: add width_bucket builtin cockroachdb/docs#5373

Closed
