Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add width_bucket builtin #39263

Merged
merged 2 commits into from
Aug 30, 2019
Merged

Conversation

kevinbarbour
Copy link
Contributor

Implements the width_bucket() builtin function

Details on the Postgres implementation can be found here:
https://www.postgresql.org/docs/11/functions-math.html

Resolves #38855

Release note (sql change): add the width_bucket builtin function.

@kevinbarbour kevinbarbour requested a review from a team August 2, 2019 14:20
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@CLAassistant
Copy link

CLAassistant commented Aug 2, 2019

CLA assistant check
All committers have signed the CLA.

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice - thanks @barbourkd for the contribution.

As far as the multiple overloads thing goes, I do think you should add one for floats as well. You should be able to see that it doesn't work if you do select width_bucket(a,b,c,d) from t where a, b, c are floats and not decimals.

Could you add some logic tests that exercise the SQL builtins themselves as well? You can add them to pkg/sql/logictest/testdata/logic_test/builtin_function (follow the patterns to understand the test format) and run that test file with make testbaselogic FILES=builtin_function.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @barbourkd)


pkg/sql/sem/builtins/builtins.go, line 2374 at r1 (raw file):

				for i, v := range thresholds.Array {
					if operand.Compare(ctx, v) < 0 {

This seems sketchy - does it actually work if the operand has a different type from the element type of thresholds?

Either way, definitely try to test this in the logic test file I mentioned above.


pkg/sql/sem/builtins/builtins.go, line 4693 at r1 (raw file):

}

func widthBucket(operand float64, b1 float64, b2 float64, count int) int {

Please put a docstring comment describing what this does. It should say something like widthBucket bars the foo.

@kevinbarbour
Copy link
Contributor Author

Thanks for the notes on this @jordanlewis

I do think you should add one for floats as well. You should be able to see that it doesn't work if you do select width_bucket(a,b,c,d) from t where a, b, c are floats and not decimals.

I'll test that out and create another overload to handle floats.

Please put a docstring comment describing what this does. It should say something like widthBucket bars the foo.
Could you add some logic tests that exercise the SQL builtins themselves as well?

👍

This seems sketchy - does it actually work if the operand has a different type from the element type of thresholds?

In this case the datum Compare func does throw an error. Will add a check and error condition to ensure the types are consistent.

@kevinbarbour
Copy link
Contributor Author

Adding a float overload causes some new issues. Before adding the float overload this worked, as it just treated the integers as decimals:

root@127.0.0.1:65297/startrek> select width_bucket(5, 1, 10, 5);
  width_bucket
+--------------+
             3
(1 row)

After adding a new overload for floats it no longer infers the ints as decimals:

root@127.0.0.1:65313/startrek> select width_bucket(5, 1, 10, 5);
pq: unknown signature: width_bucket(int, int, int, int)

That could be resolved by adding an Integer overload but then we are unable to do this:

root@127.0.0.1:65313/startrek> select width_bucket(5, 1.1, 10.0, 5);
pq: unknown signature: width_bucket(int, decimal, decimal, int)

Any thoughts on the best way to handle this?

Copy link
Contributor Author

@kevinbarbour kevinbarbour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)


pkg/sql/sem/builtins/builtins.go, line 2374 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

This seems sketchy - does it actually work if the operand has a different type from the element type of thresholds?

Either way, definitely try to test this in the logic test file I mentioned above.

Done - added error condition and a check in the logic tests.


pkg/sql/sem/builtins/builtins.go, line 4693 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Please put a docstring comment describing what this does. It should say something like widthBucket bars the foo.

Done.

Copy link
Contributor Author

@kevinbarbour kevinbarbour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)


pkg/sql/sem/builtins/builtins.go, line 2351 at r3 (raw file):

	),

	"width_bucket": makeBuiltin(defProps(),

@jordanlewis Put in overloads for int, float, and decimal. Pretty stumped on why this doesn't allow a mix of integer and decimal literals based on how other built-ins behave. Not sure if that's a use case that is worth trying to support or not.

Using the div built-in as an example, the following div overloads exist:

div(int, int)
div(float, float)
div(decimal, decimal)

when you run div(5, 4.1) it works - it seems to just act as is the 5 is a decimal value and call the decimal overload. There is no div(int, decimal) overload.

Yet with the following width_bucket overloads:

width_bucket(int, int, int, int)
width_bucket(float, float, float, int)
width_bucket(decimal, decimal, decimal, int)

when you run width_bucket(1, 1.0, 10.0, 5) it throws this:
unknown signature: width_bucket(int, decimal, decimal, int)

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @barbourkd and @jordanlewis)


pkg/sql/sem/builtins/builtins.go, line 2351 at r3 (raw file):

Previously, barbourkd (Kevin) wrote…

@jordanlewis Put in overloads for int, float, and decimal. Pretty stumped on why this doesn't allow a mix of integer and decimal literals based on how other built-ins behave. Not sure if that's a use case that is worth trying to support or not.

Using the div built-in as an example, the following div overloads exist:

div(int, int)
div(float, float)
div(decimal, decimal)

when you run div(5, 4.1) it works - it seems to just act as is the 5 is a decimal value and call the decimal overload. There is no div(int, decimal) overload.

Yet with the following width_bucket overloads:

width_bucket(int, int, int, int)
width_bucket(float, float, float, int)
width_bucket(decimal, decimal, decimal, int)

when you run width_bucket(1, 1.0, 10.0, 5) it throws this:
unknown signature: width_bucket(int, decimal, decimal, int)

Hm, agreed that this is fairly suspicious. I would think that you'd only need one for float and one for decimal. I'll give this a closer look in a bit.

@kevinbarbour
Copy link
Contributor Author

@jordanlewis planning to spend a little more time digging into this today - you had a chance to take a look at it or have any thoughts on where I should focus trying to figure out what's going on?

@jordanlewis
Copy link
Member

I unfortunately haven't had any time to try to dig into this. My guess is that there's a deeper issue with overload resolution with mixed type arguments. I recall there's some logic that acts differently depending on whether the overload has fully homogeneous types, which isn't the case here.

If you want you could try to look into that. I'll warn you that that code is fairly tricky and you'd probably need to add a bunch of prints or use a debugger to understand what's going on in this case.

In the end, it might not be necessary to have a float overload at all - it's a nice to have - but it's a bummer the overload stuff is getting in our way here.

@kevinbarbour
Copy link
Contributor Author

You weren't lying about this overload code being tricky - took me a little bit, but eventually tracked down the path it's taking through the overload resolution logic. Your guess was correct - after a certain point the overload resolution code just tries to make all the parameters homogenous and in this case they are not. I assume that we're a little outside the scope of the issue at this point, but here's what I figured out:

Say that we make it into the overload heuristics with a signature of width_bucket(decimal, int, int, int) and possible overloads of width_bucket(decimal, decimal, decimal, int) and width_bucket(float, float, float, int). It then runs through the first 3 heuristics trying to eliminate all but one signature and fails - since they are both equally valid. We get to the fourth heuristic and still have both signatures as possibilities.

// The fourth heuristic is to prefer candidates that accepts the "best"
// mutual type in the resolvable type set of all constants.

Basically at this point it determines the "best mutual type" between all of the parameters and then eliminates any overloads that won't accept homogenous parameters of that type. In our case the "best common type" of (decimal, int, int, int) is Decimal. It then eliminates both of the width_bucket signatures because neither of them can accept (decimal, decimal, decimal, decimal) parameters - leaving us with no possible overloads.

@Gurio
Copy link
Contributor

Gurio commented Aug 22, 2019

@jordanlewis could you approve/merge?

@jordanlewis
Copy link
Member

Sorry about the delay here. Let's merge the version without the float support. We can do better here later.

@kevinbarbour
Copy link
Contributor Author

kevinbarbour commented Aug 29, 2019 via email

Implements the width_bucket() builtin function

Details on the Postgres implementation can be found here:
https://www.postgresql.org/docs/11/functions-math.html

Resolves cockroachdb#38855

Release note (sql change): add the width_bucket builtin function.
@kevinbarbour
Copy link
Contributor Author

@jordanlewis Float overload has been removed.

@jordanlewis
Copy link
Member

Thanks - looks like you have a merge conflict. If you could fix that that would be great. Also, I know you probably have gotten busy with other things - if you're out of bandwidth, feel free to check the "maintainers can push" box on this PR and I can fix up the remaining small issues.

@kevinbarbour
Copy link
Contributor Author

Conflict resolved.

@kevinbarbour
Copy link
Contributor Author

@jordanlewis Not sure what's going on with the failed build. Build is successful on my local machine.

@jordanlewis
Copy link
Member

Okay, thanks @barbourkd!

bors r+

craig bot pushed a commit that referenced this pull request Aug 30, 2019
39263: sql: add width_bucket builtin r=jordanlewis a=barbourkd

Implements the width_bucket() builtin function

Details on the Postgres implementation can be found here:
https://www.postgresql.org/docs/11/functions-math.html

Resolves #38855

Release note (sql change): add the width_bucket builtin function.

Co-authored-by: Kevin Barbour <barbourkd@vcu.edu>
Co-authored-by: Kevin <kevinbarbourd@gmail.com>
@craig
Copy link
Contributor

craig bot commented Aug 30, 2019

Build succeeded

@craig craig bot merged commit 3ff89d5 into cockroachdb:master Aug 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

builtins: add width_bucket builtin function
5 participants