[SPARK-25829][SQL][FOLLOWUP] Refactor MapConcat in order to check properly the limit size #23217
Conversation
cc @cloud-fan
    if (key == null) {
      throw new RuntimeException("Cannot use null as map key.")
    }

    val index = keyToIndex.getOrDefault(key, -1)
    if (index == -1) {
      if (withSizeCheck && size >= ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
I think we should always check the size. Such a big map is very likely to cause problems.
This flag is just for perf reasons: we can skip the check in some conditions, and I didn't want to introduce perf overhead when it's not needed. If we remove the flag, we would do the comparison for each item even when it is not needed.
Hmmm, I'd like to avoid premature optimization. Actually, how much perf can this save? This code block is already doing some heavy work.
ok, let me remove it then, thanks.
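The outcome of this thread (an unconditional size check that only runs when the map actually grows, i.e. when a new distinct key is inserted) can be sketched as follows. This is a hypothetical, simplified stand-in for Spark's `ArrayBasedMapBuilder`, not the real class; the limit constant is lowered here (Spark's `ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH` is roughly `Int.MaxValue`) purely so the behavior is easy to exercise:

```scala
import scala.collection.mutable

// Hypothetical sketch of the deduplicating map builder discussed above.
// Names are illustrative, not Spark's actual implementation.
object SizeCheckedMapBuilder {
  // Stand-in for ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH, lowered for the demo.
  val MaxRoundedArrayLength = 4

  class Builder {
    private val keyToIndex = mutable.HashMap.empty[Any, Int]
    private val keys = mutable.ArrayBuffer.empty[Any]
    private val values = mutable.ArrayBuffer.empty[Any]

    def put(key: Any, value: Any): Unit = {
      if (key == null) {
        throw new RuntimeException("Cannot use null as map key.")
      }
      keyToIndex.get(key) match {
        case Some(index) =>
          // Duplicate key: overwrite in place, the map does not grow,
          // so the size check is irrelevant on this path.
          values(index) = value
        case None =>
          // The check is now unconditional (no withSizeCheck flag), but it
          // only runs when a new distinct key would grow the map.
          if (keys.size >= MaxRoundedArrayLength) {
            throw new RuntimeException(
              s"Unsuccessful attempt to build maps with ${keys.size + 1} elements " +
                s"due to exceeding the map size limit $MaxRoundedArrayLength.")
          }
          keyToIndex(key) = keys.size
          keys += key
          values += value
      }
    }

    def size: Int = keys.size
  }
}
```

The duplicate-key branch is why the flag could be dropped with little cost: the comparison happens only on insertions of new keys, which already pay for a hash lookup and two buffer appends.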
    if (key == null) {
      throw new RuntimeException("Cannot use null as map key.")
    }

    val index = keyToIndex.getOrDefault(key, -1)
    if (index == -1) {
      if (withSizeCheck && size >= ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
        throw new RuntimeException(s"Unsuccessful attempt to concat maps with $size elements " +
nit: concat -> build
thanks for the cleanup!
Test build #99664 has finished for PR 23217 at commit

Test build #99670 has finished for PR 23217 at commit

retest this please

Test build #99668 has finished for PR 23217 at commit

Test build #99680 has finished for PR 23217 at commit
thanks, merging to master!
## What changes were proposed in this pull request?

The PR starts from the [comment](apache#23124 (comment)) in the main one and it aims at:
- simplifying the code for `MapConcat`;
- being more precise in checking the limit size.

## How was this patch tested?

existing tests

Closes apache#23217 from mgaido91/SPARK-25829_followup.

Authored-by: Marco Gaido <marcogaido91@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
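The "more precise" limit check means counting distinct keys after deduplication rather than the raw sum of the input map sizes: two maps that share keys may concatenate fine even if their combined raw size exceeds the limit. A hypothetical, self-contained sketch of those semantics (this is not Spark's actual `MapConcat` implementation; the function name and the lowered limit are illustrative):

```scala
// Hypothetical sketch: concatenating key/value sequences with last-wins
// semantics, checking the size limit against distinct keys only.
object MapConcatSketch {
  val SizeLimit = 4 // stand-in for ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH

  def concat(maps: Seq[Seq[(String, Int)]]): Map[String, Int] = {
    var result = Map.empty[String, Int]
    for (m <- maps; (k, v) <- m) {
      if (k == null) throw new RuntimeException("Cannot use null as map key.")
      result += (k -> v) // duplicate keys: later entries win, no growth
      if (result.size > SizeLimit) {
        throw new RuntimeException(
          s"Unsuccessful attempt to build maps with ${result.size} elements " +
            s"due to exceeding the map size limit $SizeLimit.")
      }
    }
    result
  }
}
```

Checking the deduplicated size is what makes the limit precise: summing the input sizes up front would reject inputs whose overlapping keys collapse to a map that is actually within bounds.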