[BUG] Hive Decimal parsing cannot go above E99 or below E-99 or long values #7246

Open
revans2 opened this issue Dec 5, 2022 · 0 comments
Labels
bug Something isn't working

revans2 commented Dec 5, 2022

Describe the bug
Hive has its own decimal parser, and that parser rejects exponents above 99 or below -99. Our implementation in spark-rapids-jni accepts such exponents and parses them correctly.

Hive's parser also fails on numbers that are too long to fit in 39 digits before the exponent is applied.

Steps/Code to reproduce bug

After #7245 is merged, run the tests for "hive-delim-text/extended-float-values" that read the data as decimal types. (They should be marked with this issue number.)

1.7976931348623157E-308
1.7976931348623157e-100
1.2e-234

These values result in 0.0 on the GPU, but on the CPU the result is null because the absolute value of the exponent is above 99. I think the same would reproduce for positive exponents above 99, but I don't know how to get a negative-scale decimal value into a Hive table that Spark reads.
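As a rough illustration of why 0.0 is the mathematically expected answer, here is a minimal sketch that uses Python's `decimal` module as a stand-in for a parser with no exponent limit. It is not the spark-rapids-jni code; the function name `parse_without_exponent_limit` and the target scale of 1 are assumptions chosen only for the example.

```python
from decimal import Decimal, ROUND_HALF_UP

# Inputs copied from the issue text.
SAMPLES = ["1.7976931348623157E-308", "1.7976931348623157e-100", "1.2e-234"]


def parse_without_exponent_limit(text, scale=1):
    """Parse `text` as a decimal and round it to `scale` fractional digits.

    Hypothetical helper: `scale=1` is an assumption for illustration,
    not the scale used by the actual tests.
    """
    quantum = Decimal(1).scaleb(-scale)  # scale=1 gives Decimal('0.1')
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_UP)


for sample in SAMPLES:
    # Each value is far smaller than the quantum, so it rounds to zero.
    print(sample, "->", parse_without_exponent_limit(sample))
```

All three inputs are valid decimal literals whose magnitude is far below any practical scale, so a parser without the exponent restriction rounds them to zero instead of rejecting them.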

Also, if the part of the number before the radix point is too long (I think the limit is 39 digits), the parse fails and returns null on the CPU, but not on the GPU.

1111111111111111111111111111111111111111E-39
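The two CPU-side restrictions described above could be emulated roughly as follows. This is a sketch of the behavior as reported in this issue, not Hive's actual parser: the name `hive_like_parse_decimal`, the regular expression, and the way digits are counted (integer digits with leading zeros ignored) are all assumptions, and the 39-digit limit is only the guess stated above.

```python
import re
from decimal import Decimal

MAX_ABS_EXPONENT = 99    # limit described in this issue
MAX_INTEGER_DIGITS = 39  # the issue only guesses at this value

# sign, integer digits, optional fraction digits, optional exponent
_NUMBER = re.compile(r"^([+-]?)(\d*)(?:\.(\d*))?(?:[eE]([+-]?\d+))?$")


def hive_like_parse_decimal(text):
    """Return a Decimal, or None where the CPU path is reported to give null."""
    text = text.strip()
    match = _NUMBER.match(text)
    if match is None:
        return None
    _sign, int_digits, frac_digits, exponent = match.groups()
    if not int_digits and not frac_digits:
        return None  # no digits at all, e.g. "" or "."
    if exponent is not None and abs(int(exponent)) > MAX_ABS_EXPONENT:
        return None  # exponent outside [-99, 99]
    if len(int_digits.lstrip("0")) > MAX_INTEGER_DIGITS:
        return None  # too many digits before the radix point (assumed rule)
    return Decimal(text)


print(hive_like_parse_decimal("1.2e-234"))           # exponent too small
print(hive_like_parse_decimal("1" * 40 + "E-39"))    # 40 integer digits
print(hive_like_parse_decimal("1.5e-99"))            # within both limits
```

A GPU implementation that wanted to match Hive would need checks of this kind before handing the string to the unrestricted parser; the exact digit-counting rule would have to be confirmed against Hive itself.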

Expected behavior
For Hive text input, we should parse decimal values in the same buggy way that Hive does.

Additional context
I don't think this is critical to fix. I mostly want to document this.

revans2 added the "bug" (Something isn't working) and "? - Needs Triage" (Need team to review and classify) labels on Dec 5, 2022
sameerz removed the "? - Needs Triage" (Need team to review and classify) label on Dec 6, 2022