Currently I see two points in the Modin frame where we compute either all partition lengths for the first column of partitions, or all column widths for the first row of partitions:
The first line, in _copartition, recently caused single-threaded execution for a frame with partitions of Decimal objects. Each frame had a transpose on the queue. Executing a multiply then caused the widths to be computed serially, so each partition's call queue was drained in sequence. As a result, the partitions put their transpose results in the object store one at a time, and slowly. (The objects took a while to put in the object store because Decimal data is slow to serialize.) However, in this case the lengths and widths were cached, so there was no need to compute them at all.
Attached is a Ray timeline, and here is an image of the single-threaded execution for the transpose in the middle (from a similar script).
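To make the behaviour described above concrete, here is a minimal plain-Python sketch. It does not use Modin's actual classes: `ToyPartition`, `widths_by_querying`, and `widths_from_cache` are hypothetical names, and the "expensive" drain is only counted, not timed. It shows that asking each partition for its width drains the call queues one after another, while reading cached widths leaves the queues untouched.

```python
class ToyPartition:
    """Hypothetical stand-in for a lazily evaluated partition (not Modin's class)."""

    def __init__(self, rows, cols):
        self.shape = (rows, cols)
        self.call_queue = []   # deferred operations, applied only on demand
        self.drain_count = 0   # how many times the queue was executed

    def add_to_queue(self, func):
        self.call_queue.append(func)

    def drain(self):
        # Executing the queue is the costly step (e.g. serializing Decimal data).
        for func in self.call_queue:
            self.shape = func(self.shape)
        self.call_queue = []
        self.drain_count += 1

    def width(self):
        # Asking for the width forces any pending work to run first.
        if self.call_queue:
            self.drain()
        return self.shape[1]


def transpose(shape):
    rows, cols = shape
    return (cols, rows)


def widths_by_querying(row_of_partitions):
    # One blocking query per partition, so the queues drain one after another.
    return [part.width() for part in row_of_partitions]


def widths_from_cache(cached_widths, row_of_partitions):
    # Prefer widths that are already known; only query when nothing is cached.
    if cached_widths is not None:
        return list(cached_widths)
    return widths_by_querying(row_of_partitions)


def make_row():
    # Two partitions, each with a transpose waiting on its call queue.
    parts = [ToyPartition(4, 2), ToyPartition(4, 3)]
    for part in parts:
        part.add_to_queue(transpose)
    return parts


queried = make_row()
queried_widths = widths_by_querying(queried)
queried_drains = sum(part.drain_count for part in queried)

cached = make_row()
cached_result = widths_from_cache([4, 4], cached)
cached_drains = sum(part.drain_count for part in cached)

print(queried_widths, queried_drains)
print(cached_result, cached_drains)
```

In this toy model both paths return the same widths, but only the querying path executes the queued transposes, and it does so sequentially.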
It turns out that in case 1), reindexed_base has unknown axis lengths, because we might have to add elements along the axis to align with the other frame, e.g. for
the new reindexed_base has length 2 ([0, 'b']) instead of 1.
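As a rough illustration of why the length can change, here is a plain-Python sketch (not Modin code). `outer_join_labels` is a hypothetical helper, and the input labels are assumptions: `[0]` for the base frame and `['b']` for the other frame. Aligning the two along the axis means taking the union of their labels, so the reindexed base ends up longer than the original.

```python
def outer_join_labels(base_labels, other_labels):
    # Order-preserving union of axis labels, mimicking an outer join.
    return list(dict.fromkeys(list(base_labels) + list(other_labels)))


base_labels = [0]        # assumed labels of the base frame (length 1)
other_labels = ["b"]     # assumed labels of the other frame

joined = outer_join_labels(base_labels, other_labels)

# After alignment the base frame has one entry per joined label,
# so any lengths cached before the reindex no longer apply.
reindexed_base_length = len(joined)

print(joined, reindexed_base_length)
```

Only when the union leaves the base labels unchanged would the previously known partition sizes still be valid.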
To fix the single-threadedness there we will need #4494. I don't see an easy fix. We could maybe add some extra code for the case where we don't expect the union with the other frames' indices to change the partition sizes.
For 2) I think we really can use self._column_widths and self._row_lengths. I will make a PR for that.
System information
modin.__version__: 0f70e82
Describe the problem
Currently I see two points in the Modin frame where we compute either all partition lengths for the first column of partitions, or all column widths for the first row of partitions:
1) modin/modin/core/dataframe/pandas/dataframe/dataframe.py, line 2380 in c736def
2) modin/modin/core/dataframe/pandas/dataframe/dataframe.py, line 2152 in c736def
Reproduction script