-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-32018][SQL][FollowUp][3.0] Throw exception on decimal value overflow of sum aggregation #29404
Conversation
Test build #127309 has finished for PR 29404 at commit
|
Test build #127311 has finished for PR 29404 at commit
|
Test build #127316 has finished for PR 29404 at commit
|
Test build #127313 has finished for PR 29404 at commit
|
This adds perf overhead as we need to check overflow after each Add operation, while the master branch only checks overflow at the end because we have an extra agg buffer slot. I think this perf overhead is necessary to avoid this correctness bug in 3.0/2.4, but I'm open to other opinions. cc @skambha @dongjoon-hyun @viirya @maropu |
Is this saying the |
And looks like we don't have many choices. I think correctness should be considered first. |
I think so. 3.0 doesn't even have mechanisms to detect incompatible state store format, and it may be too much work to backport the bug fix and the streaming state store checks. |
I have the same opnion with @viirya, and I think the overhead is inevitable. |
Thank you for pinging me, @gengliangwang and @cloud-fan . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM.
thanks, merging to 3.0! |
…erflow of sum aggregation ### What changes were proposed in this pull request? This is a followup of #29125 In branch 3.0: 1. for hash aggregation, before #29125 there will be a runtime exception on decimal overflow of sum aggregation; after #29125, there could be a wrong result. 2. for sort aggregation, with/without #29125, there could be a wrong result on decimal overflow. While in master branch(the future 3.1 release), the problem doesn't exist since in #27627 there is a flag for marking whether overflow happens in aggregation buffer. However, the aggregation buffer is written in steaming checkpoints. Thus, we can't change to aggregation buffer to resolve the issue. As there is no easy solution for returning null/throwing exception regarding `spark.sql.ansi.enabled` on overflow in branch 3.0, we have to make a choice here: always throw exception on decimal value overflow of sum aggregation. ### Why are the changes needed? Avoid returning wrong result in decimal value sum aggregation. ### Does this PR introduce _any_ user-facing change? Yes, there is always exception on decimal value overflow of sum aggregation, instead of a possible wrong result. ### How was this patch tested? Unit test case Closes #29404 from gengliangwang/fixSum. Authored-by: Gengliang Wang <gengliang.wang@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Thank you all! |
It seems this commit caused the valid test failure in
I've checked that the test failed w/this commit and passed w/o it in my local env. |
ah it's branch-3.0 so the github action doesn't count. @HyukjinKwon how hard it is to have github action in branch-3.0? |
### What changes were proposed in this pull request? Revert SPARK-32018 related changes in branch 3.0: #29125 and #29404 ### Why are the changes needed? #29404 is made to fix correctness regression introduced by #29125. However, the behavior of decimal overflow is strange in non-ansi mode: 1. from 3.0.0 to 3.0.1: decimal overflow will throw exceptions instead of returning null on decimal overflow 2. from 3.0.1 to 3.1.0: decimal overflow will return null instead of throwing exceptions. So, this PR proposes to revert both #29404 and #29125. So that Spark will return null on decimal overflow in Spark 3.0.0 and Spark 3.0.1. ### Does this PR introduce _any_ user-facing change? Yes, Spark will return null on decimal overflow in Spark 3.0.1. ### How was this patch tested? Unit tests Closes #29450 from gengliangwang/revertDecimalOverflow. Authored-by: Gengliang Wang <gengliang.wang@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@cloud-fan, I will port it back to other branches. I think it's doable. |
What changes were proposed in this pull request?
This is a followup of #29125
In branch 3.0:
While in master branch(the future 3.1 release), the problem doesn't exist since in #27627 there is a flag for marking whether overflow happens in aggregation buffer. However, the aggregation buffer is written in steaming checkpoints. Thus, we can't change to aggregation buffer to resolve the issue.
As there is no easy solution for returning null/throwing exception regarding
spark.sql.ansi.enabled
on overflow in branch 3.0, we have to make a choice here: always throw exception on decimal value overflow of sum aggregation.Why are the changes needed?
Avoid returning wrong result in decimal value sum aggregation.
Does this PR introduce any user-facing change?
Yes, there is always exception on decimal value overflow of sum aggregation, instead of a possible wrong result.
How was this patch tested?
Unit test case