FEAT-#7337: Using dynamic partitioning in broadcast_apply #7338
@@ -3157,9 +3157,7 @@ def dropna(self, **kwargs):
             lib.no_default,
             None,
         )
-        # FIXME: this is a naive workaround for this problem: https://github.com/modin-project/modin/issues/5394
-        # if there are too many partitions then all non-full-axis implementations start acting very badly.
-        # The here threshold is pretty random though it works fine on simple scenarios
+        # The map reduce approach works well for frames with few columnar partitions
Suggested change
It is not correct. I got the following results:
But I didn't add any new conditions here, because this is beyond the scope of the current task.
         processable_amount_of_partitions = (
             self._modin_frame.num_parts < CpuCount.get() * 32
         )
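For readers following the thread, the threshold in this hunk can be sketched in isolation. This is only an illustration, not Modin's code: `use_map_reduce` is a hypothetical helper, `os.cpu_count()` stands in for `CpuCount.get()`, and the factor of 32 is copied from the diff. What `dropna` does when the check fails is not shown in this hunk, so it is left out.

```python
import os
from typing import Optional


def use_map_reduce(num_parts: int, cpu_count: Optional[int] = None, factor: int = 32) -> bool:
    """Hypothetical helper mirroring the check in the hunk above.

    Returns True when the frame has fewer partitions than cpu_count * factor.
    """
    if cpu_count is None:
        # Assumption: os.cpu_count() is used here only as a stand-in for CpuCount.get().
        cpu_count = os.cpu_count() or 1
    return num_parts < cpu_count * factor


# With 8 CPUs the cutoff is 8 * 32 = 256 partitions (strictly less than).
print(use_map_reduce(255, cpu_count=8))
print(use_map_reduce(256, cpu_count=8))
```

Because the comparison is strict, a frame with exactly `cpu_count * 32` partitions falls on the other side of the cutoff.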
Is it important to keep the column names in case of empty partitions? How did you come to this?
I do it to get the expected result, because the result columns are already known. Would it cause any problems or slowdowns?
I just want to make sure that Modin's behavior matches that of pandas. Do we have a test for this new code?
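As a reference point for the pandas side of that comparison, a minimal sketch of how pandas treats a frame with no rows might look like the following. It uses only pandas, not Modin, and `columns_after_dropna_on_empty` is a hypothetical helper name. It is not the test from this PR; it only shows that pandas keeps the column labels when the input has zero rows.

```python
import pandas as pd


def columns_after_dropna_on_empty(df: pd.DataFrame) -> list:
    """Hypothetical helper: run dropna on a zero-row slice and report its columns."""
    empty = df.iloc[:0]       # zero rows, same column labels as df
    result = empty.dropna()   # still zero rows; column labels are preserved
    return list(result.columns)


df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
print(columns_after_dropna_on_empty(df))
```

A Modin test could compare the same operation against this pandas result, assuming the empty-partition case can be reproduced in a test frame.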