Cast int to decimal has an unexpected result #5793

jayzhan211 · 2024-05-22T02:13:04Z

Describe the bug

To Reproduce

#[test]
    fn test_int_to_decimal() {
        let array = Int64Array::from(vec![5, 6, 7, 8, 9]);
        let decimal_array = cast(&array, &DataType::Decimal128(10, 4)).unwrap();
        println!("{:?}", array);
        println!("{:?}", decimal_array);
        let decimal_array = cast(&array, &DataType::Decimal128(10, -4)).unwrap();
        println!("{:?}", decimal_array);
    }

---- cast::tests::test_int_to_decimal stdout ----
PrimitiveArray<Int64>
[
  5,
  6,
  7,
  8,
  9,
]
PrimitiveArray<Decimal128(10, 4)>
[
  50000,
  60000,
  70000,
  80000,
  90000,
]
PrimitiveArray<Decimal128(10, -4)>
[
  0,
  0,
  0,
  0,
  0,
]

Expected behavior

I think we should get

PrimitiveArray<Decimal128(10, 4)>
[
  5,
  6,
  7,
  8,
  9,
]

PrimitiveArray<Decimal128(10, -4)>
[
  50000,
  60000,
  70000,
  80000,
  90000,
]

Additional context

https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html#variant.Decimal128

The text was updated successfully, but these errors were encountered:

jayzhan211 · 2024-05-22T02:31:26Z

I notice there is value_as_string, so I think Decimal128(10, 4) is correct, but I'm not sure if 0 for Decimal128(10, -4) is expected or not, I expect to get 50000.

"5.0000"
"00000"

tustvold · 2024-05-27T14:53:57Z

There is a good chance the conversion does not work correctly for negative scales

ByteBaker · 2024-08-19T09:13:44Z

Hi, I think I've found the cause. Would like to take this up for a fix.

jayzhan211 · 2024-08-19T11:47:13Z

Hi, I think I've found the cause. Would like to take this up for a fix.

Sure

ByteBaker · 2024-09-18T07:40:50Z

Hi @tustvold @jayzhan211 I've made the changes. Please review them.

alamb · 2024-09-18T20:19:45Z

I am depressed about the large review backlog in this crate. We are looking for more help from the community reviewing PRs -- see #6418 for more

wjones127 · 2024-09-19T01:08:20Z

It's not clear to me that the original behavior is bad. It seems like it is casting numerically. That is, 2 is being translated into 2.0000 for decimal(10, 4) and 0 for decimal(10, -4). Numerically, those seem to be the closest values representable in those types.

It seems like what you are suggesting is casting uses the integer as the internal integer representation and then attaches the scale blindly. So 2 would be translated to 0.0002 for decimal(10, 4) and 2000 for decimal(10, -4). That seems like more surprising behavior to me. Am I understanding your suggestion correctly?

Is there a SQL system that sets some precedent for how this work? For example, how does postgres do this?

ByteBaker · 2024-09-19T02:36:31Z

To be honest there was little reference on its behaviour, including on the internet. And naturally I'd have asked how to go about this, but then I read the documentation which said negative precision is just the padding of zeroes to the right.

jayzhan211 · 2024-09-19T02:52:23Z

The proposal is based on the arrow's document so it might not be correct / expected

https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html#variant.Decimal128

Exact 128-bit width decimal value with precision and scale

precision is the total number of digits
scale is the number of digits past the decimal
For example the number 123.45 has precision 5 and scale 2.

In certain situations, scale could be negative number. For negative scale, it is the number of padding 0 to the right of the digits.

For example the number 12300 could be treated as a decimal has precision 3 and scale -2.

Postgres

postgres=# select 5::decimal(10, -4);
 numeric 
---------
       0
(1 row)

I think we might need to fix the doc

ByteBaker · 2024-09-19T03:00:55Z

I was just about to post this too. 😅

This also means that this issue can be closed, and the PR will be no longer required.

ByteBaker · 2024-09-19T03:02:11Z

Postgres documents the behavior as below:

Beginning in PostgreSQL 15, it is allowed to declare a numeric column with a negative scale. Then values will be rounded to the left of the decimal point. The precision still represents the maximum number of non-rounded digits. Thus, a column declared as

NUMERIC(2, -3)
will round values to the nearest thousand and can store values between -99000 and 99000, inclusive

Check out the docs.

alamb · 2024-09-19T16:54:10Z

@jayzhan211 if you concur let's close the ticket

alamb · 2024-09-19T16:54:18Z

Thank you for the research @ByteBaker and @wjones127

jayzhan211 · 2024-09-20T00:35:32Z

Sure, we just need to fix the document to be consistent with the postgres's one

jayzhan211 added the bug label May 22, 2024

ByteBaker mentioned this issue Sep 18, 2024

fix: bad cast to Decimal128 for negative precision #6413

Closed

jayzhan211 closed this as completed Sep 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cast int to decimal has an unexpected result #5793

Cast int to decimal has an unexpected result #5793

jayzhan211 commented May 22, 2024

jayzhan211 commented May 22, 2024

tustvold commented May 27, 2024 •

edited

Loading

ByteBaker commented Aug 19, 2024

jayzhan211 commented Aug 19, 2024

ByteBaker commented Sep 18, 2024

alamb commented Sep 18, 2024

wjones127 commented Sep 19, 2024

ByteBaker commented Sep 19, 2024

jayzhan211 commented Sep 19, 2024 •

edited

Loading

ByteBaker commented Sep 19, 2024

ByteBaker commented Sep 19, 2024

alamb commented Sep 19, 2024

alamb commented Sep 19, 2024

jayzhan211 commented Sep 20, 2024

Cast int to decimal has an unexpected result #5793

Cast int to decimal has an unexpected result #5793

Comments

jayzhan211 commented May 22, 2024

jayzhan211 commented May 22, 2024

tustvold commented May 27, 2024 • edited Loading

ByteBaker commented Aug 19, 2024

jayzhan211 commented Aug 19, 2024

ByteBaker commented Sep 18, 2024

alamb commented Sep 18, 2024

wjones127 commented Sep 19, 2024

ByteBaker commented Sep 19, 2024

jayzhan211 commented Sep 19, 2024 • edited Loading

ByteBaker commented Sep 19, 2024

ByteBaker commented Sep 19, 2024

alamb commented Sep 19, 2024

alamb commented Sep 19, 2024

jayzhan211 commented Sep 20, 2024

tustvold commented May 27, 2024 •

edited

Loading

jayzhan211 commented Sep 19, 2024 •

edited

Loading