-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add bitwise shift operations #12888
Add bitwise shift operations #12888
Conversation
This is a duplicate PR (original one: #12886). I suggest we close this one and use the original one as we already had review comments there. |
I recommended that we do that because there was a tricky rebase issue. The other one is closed and we can address that comment here. |
Other functions in Now, all of these functions will take in Java type of long, which means after shifting it's possible that the value exceeds the range of the type. But luckily we already specify the bits we want shifted, so we could adjust the max range to match the type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM besides directly using the hex format for readability. @jessesleeping @rongrong thoughts?
presto-main/src/test/java/com/facebook/presto/operator/scalar/TestBitwiseFunctions.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments. Otherwise looks good.
} | ||
|
||
if (bits <= 1 || bits > 64) { | ||
throw new PrestoException(INVALID_FUNCTION_ARGUMENT, "Bits specified in bit_count must be between 2 and 64, got " + bits); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message use wrong function name (bit_count
). Please fix it.
One question, why do we want to block the shifting call when bits == 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In previous bitwise functions, specifically bit_count, the input wasn't allowed to have a single bit, so I kept it like that for consistency. I can change it to simply being bits < 1
though.
presto-main/src/main/java/com/facebook/presto/operator/scalar/BitwiseFunctions.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/operator/scalar/BitwiseFunctions.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/operator/scalar/BitwiseFunctions.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wikipedia seems to suggest that logical shift is the one that doesn't care about sign, and arithmetic shift is the one that does:
https://en.wikipedia.org/wiki/Logical_shift
https://en.wikipedia.org/wiki/Arithmetic_shift
Is there an open issue for this? If not, should we open one to discuss about the name and parameters?
presto-main/src/main/java/com/facebook/presto/operator/scalar/BitwiseFunctions.java
Outdated
Show resolved
Hide resolved
@@ -28,4 +28,27 @@ Bitwise Functions | |||
|
|||
Returns the bitwise XOR of ``x`` and ``y`` in 2's complement representation. | |||
|
|||
.. function:: bitwise_shift_left_logical(x, shift, bits) -> bigint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally found the bits
parameter really confusing. If x
is a smallint
, am I supposed to know how many bits a smallint
should have and provide the right value for bits
to get a correct result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bits convention is already a established with bit_count
: https://prestodb.github.io/docs/current/functions/bitwise.html
I think, while confusing, it should be consistent; or, we deprecate the bits
convention. But for now I'd stick with consistency.
Personally I'd like the names to be |
Shift left logical operation on ``x`` (treated as ``bits``-bit signed integer) | ||
shifted by ``shift`` in 2's complement representation. | ||
|
||
SELECT bitwise_shift_left_logical(7, 2, 4); -- 12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
12 is a 5-bit signed number. This doesn't make sense to me. if 12 is a 4-bit signed number, then we cannot represent a 64-bit signed number with bigint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(not necessarily that operation doesn't make sense, maybe just the description doesn't make sense).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, for logical left shift, you probably will always fill 0, so bits
is how many bits to keep after shift (where to truncate). So describing the number as n-bit signed just adds confusion.
For logical right shift, what you want to describe is what to fill with (0 or 1), so maybe defining the sign-bit makes sense. Though I think it makes more sense to say how many bits I want to operate on, and whether I'd fill it with 0 or 1 (logical_right_shift(number, shift, bits, 0/1)
) Describing logical shift with signs is just very confusing to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If right shift is also going to fill with 0, then we should not describe the function using "signed number" at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By definition, logical right shift will always fill in 0s, so it does not preserve the sign of the input. Meanwhile, arithmetic right shift will fill with the MSB of the input. This operation will preserve the sign of the input. Therefore, I think we should just stick to these default conventions. I feel like adding the 0/1
field as a 4th input would just make it more confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did remove the "signed number" part in the description though, I can see how that can be confusing.
Related to #4028 |
I think they should include https://prestodb.github.io/docs/current/functions/bitwise.html There would be no difference between arithmetic and logic left shift. So we're left with:
i.e. logical/arithmetic is now in front rather than back of the name |
Allows bitwise shift operations including arithmetic shift right, logical shift right, and shift left.
I'm wondering, why are we going with functions rather than operator? Ideally, there should be |
I guess it's because SQL syntax only has basic numeric operators specified (+, -, ...). For those not in SQL spec, it could be safer to treat them as functions for possible future compliance.
|
Dug around a little bit and Postgres support |
Allows bitwise shift operations including arithmetic shift right, logical shift right, and shift left.