PERF: implement scalar ops blockwise #29853

jbrockmendel · 2019-11-26T02:41:19Z

Similar to #28583, but going through BlockManager.apply.
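As a rough illustration of what "blockwise" means here (plain NumPy, not the pandas internals): columns that share a dtype can be stored together in one 2D array, so a scalar op can be applied once to that array instead of once per column. The helper names `columnwise_op` and `blockwise_op` below are hypothetical and exist only for this sketch.

```python
import operator

import numpy as np


def columnwise_op(block, scalar, op):
    """Apply op to each column separately, then reassemble the columns."""
    columns = [op(block[:, i], scalar) for i in range(block.shape[1])]
    return np.column_stack(columns)


def blockwise_op(block, scalar, op):
    """Apply op once to the whole 2D block of same-dtype columns."""
    return op(block, scalar)


# A small homogeneous int64 "block": 4 rows, 3 columns.
block = np.arange(12, dtype=np.int64).reshape(4, 3)

slow = columnwise_op(block, 2, operator.floordiv)
fast = blockwise_op(block, 2, operator.floordiv)

# Both paths give the same values; the blockwise path makes one call.
assert np.array_equal(slow, fast)
```

The per-column path pays Python-level overhead once per column, which is why wide frames are where the difference shows up.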


pep8speaks · 2019-12-21T01:22:03Z

Hello @jbrockmendel! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-12-27 16:35:39 UTC

jbrockmendel · 2019-12-21T01:27:25Z

Resolved the issue with test_expressions behaving unexpectedly.

Added an asv that times operations on a homogeneous-dtype DataFrame (rows=20k, cols=100) with a scalar. Not sure how many variants of these to do; could be easy to go overboard.

       before           after         ratio
     [0cd388fd]       [1fc1e3ec]
     <cy30>           <back-to-arith>
-         110±4ms         59.6±1ms     0.54  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function floordiv>)
-         115±3ms         59.5±1ms     0.52  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function floordiv>)
-        88.2±1ms       44.1±0.7ms     0.50  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function floordiv>)
-        91.0±3ms       44.7±0.9ms     0.49  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function floordiv>)
-        94.8±3ms         46.5±1ms     0.49  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function floordiv>)
-        92.1±2ms       44.9±0.4ms     0.49  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function floordiv>)
-        93.9±4ms       42.2±0.2ms     0.45  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function floordiv>)
-        94.6±2ms         41.7±1ms     0.44  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function floordiv>)
-        78.0±4ms       28.6±0.4ms     0.37  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function pow>)
-        66.9±1ms       22.5±0.6ms     0.34  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function mod>)
-        67.8±2ms         22.1±1ms     0.33  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function mod>)
-        70.0±3ms       22.5±0.5ms     0.32  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function pow>)
-        67.3±2ms       20.8±0.6ms     0.31  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function mod>)
-        66.7±1ms         20.3±1ms     0.30  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function mod>)
-        64.9±1ms       19.3±0.5ms     0.30  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function mod>)
-      65.5±0.7ms       19.1±0.8ms     0.29  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function mod>)
-        73.2±1ms         18.2±1ms     0.25  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function mod>)
-        74.7±3ms         18.3±1ms     0.25  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function mod>)
-        60.3±1ms       9.87±0.1ms     0.16  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function pow>)
-        60.0±2ms       9.80±0.3ms     0.16  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function pow>)
-        59.4±3ms       9.65±0.4ms     0.16  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function pow>)
-        58.5±2ms       9.09±0.3ms     0.16  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function pow>)
-      42.6±0.9ms      3.96±0.09ms     0.09  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function xor>)
-      51.3±0.4ms       4.70±0.1ms     0.09  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function pow>)
-        52.7±2ms       4.68±0.1ms     0.09  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function pow>)
-      42.3±0.8ms       3.54±0.1ms     0.08  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function or_>)
-        43.4±2ms       3.36±0.2ms     0.08  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function and_>)
-        43.0±1ms       3.23±0.2ms     0.08  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function xor>)
-        44.9±1ms       3.36±0.2ms     0.07  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function or_>)
-      45.6±0.4ms       3.41±0.1ms     0.07  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function and_>)
-        49.8±1ms       3.24±0.1ms     0.07  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function mul>)
-      28.0±0.5ms       1.81±0.2ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function eq>)
-      48.8±0.7ms      3.15±0.07ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function truediv>)
-      50.1±0.5ms      3.15±0.09ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function mul>)
-      50.1±0.8ms       3.14±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function truediv>)
-        51.4±1ms       3.21±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function truediv>)
-      49.6±0.3ms      3.10±0.05ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function sub>)
-      49.5±0.9ms      3.08±0.06ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function mul>)
-        51.3±1ms       3.17±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function add>)
-        50.0±1ms      3.07±0.09ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function add>)
-      28.7±0.8ms       1.77±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function gt>)
-      50.4±0.8ms       3.06±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function add>)
-        29.1±1ms      1.76±0.03ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function ne>)
-        50.1±1ms       3.02±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function truediv>)
-      49.8±0.7ms      2.99±0.07ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function truediv>)
-        50.4±1ms       3.02±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function truediv>)
-        50.9±1ms       3.04±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function mul>)
-        50.8±1ms      3.03±0.09ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function truediv>)
-        50.8±1ms       3.03±0.3ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function sub>)
-      52.2±0.9ms      3.08±0.09ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function mul>)
-        54.0±1ms      3.17±0.09ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function add>)
-      51.1±0.7ms       2.98±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function add>)
-        50.8±1ms      2.96±0.06ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function sub>)
-        53.7±1ms      3.12±0.04ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function sub>)
-        52.7±2ms      3.05±0.08ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function truediv>)
-        51.2±1ms       2.96±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function sub>)
-        28.6±1ms      1.65±0.07ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function le>)
-      29.9±0.6ms       1.72±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function ge>)
-        29.7±2ms      1.69±0.04ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function lt>)
-      54.0±0.7ms       3.06±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function sub>)
-      55.1±0.9ms       3.12±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function add>)
-      50.6±0.8ms       2.86±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function add>)
-        59.1±2ms       3.31±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function sub>)
-        29.4±1ms      1.65±0.08ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function ge>)
-        53.5±1ms       2.97±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function mul>)
-        30.9±2ms       1.69±0.1ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function lt>)
-        29.2±2ms      1.59±0.09ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function eq>)
-      54.9±0.9ms       2.98±0.2ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function mul>)
-        54.2±1ms       2.93±0.1ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function add>)
-        54.5±1ms      2.92±0.08ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function sub>)
-        29.8±2ms       1.58±0.1ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function gt>)
-        58.2±3ms       3.08±0.2ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function mul>)
-        31.2±1ms       1.61±0.1ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function le>)
-        31.0±2ms       1.59±0.1ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function ne>)
-        27.8±1ms      1.29±0.06ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function ge>)
-      27.4±0.8ms       1.22±0.1ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function le>)
-        28.0±1ms      1.17±0.04ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function le>)
-      27.8±0.6ms       1.16±0.1ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function lt>)
-      27.6±0.9ms      1.15±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function lt>)
-      27.8±0.4ms      1.16±0.07ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function le>)
-        29.8±2ms      1.24±0.06ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function ne>)
-      27.7±0.8ms      1.15±0.07ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function lt>)
-      27.4±0.6ms      1.14±0.08ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function le>)
-      28.5±0.8ms      1.18±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function ne>)
-      27.8±0.7ms      1.15±0.01ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function le>)
-      28.5±0.6ms      1.16±0.06ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function gt>)
-      27.1±0.8ms      1.11±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function lt>)
-      27.5±0.6ms      1.12±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function gt>)
-      27.4±0.4ms      1.11±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function ne>)
-      27.1±0.9ms      1.10±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function ge>)
-      28.6±0.4ms      1.14±0.06ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function eq>)
-      27.4±0.6ms      1.09±0.02ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function eq>)
-      28.2±0.7ms      1.12±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function ne>)
-      27.3±0.6ms      1.08±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function ge>)
-        28.1±1ms      1.12±0.02ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function gt>)
-        28.3±1ms      1.12±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function gt>)
-      28.1±0.8ms      1.11±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function lt>)
-      27.1±0.8ms      1.07±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function le>)
-        27.9±1ms      1.10±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function ge>)
-      28.2±0.5ms      1.09±0.06ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function lt>)
-      29.0±0.5ms      1.11±0.04ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function ne>)
-        29.0±2ms      1.11±0.04ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function gt>)
-      29.2±0.2ms      1.11±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function eq>)
-        28.2±2ms      1.07±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function ge>)
-        29.1±1ms      1.10±0.04ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function gt>)
-      29.5±0.2ms      1.11±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function eq>)
-        30.2±2ms      1.13±0.04ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function ge>)
-      28.4±0.2ms      1.05±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function eq>)
-        29.9±1ms      1.09±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function ne>)
-      29.1±0.6ms      1.03±0.02ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function eq>)

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
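For readers unfamiliar with asv, a benchmark of the kind described above might look roughly like the sketch below. The class and method names are taken from the output above; the parameter lists are a reduced, assumed subset, and the body is a guess at the shape of the benchmark rather than the code added in this PR.

```python
import operator

import numpy as np
import pandas as pd


class IntFrameWithScalar:
    # asv runs the benchmark once per combination of these parameters.
    # The lists here are a shortened, assumed subset of the real ones.
    params = [
        [np.float64, np.int64],
        [2, 3.0],
        [operator.add, operator.mul, operator.floordiv, operator.eq, operator.lt],
    ]
    param_names = ["dtype", "scalar", "op"]

    def setup(self, dtype, scalar, op):
        # Homogeneous-dtype frame: 20k rows, 100 columns.
        arr = np.random.randn(20000, 100)
        self.df = pd.DataFrame(arr.astype(dtype))

    def time_frame_op_with_scalar(self, dtype, scalar, op):
        # Only the operation itself is timed; setup is excluded by asv.
        op(self.df, scalar)
```

Each row of the table corresponds to one (dtype, scalar, op) combination, and the ratio column is the "after" time divided by the "before" time.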

jreback

looks good. need to go through it again, some minor comments.

jreback · 2019-12-24T14:31:49Z

pandas/core/ops/__init__.py

@@ -372,6 +373,10 @@ def dispatch_to_series(left, right, func, str_rep=None, axis=None):
    right = lib.item_from_zerodim(right)
    if lib.is_scalar(right) or np.ndim(right) == 0:

+        array_op = get_array_op(func, str_rep=str_rep)


can you add a comment here on what is going on

jreback · 2019-12-24T14:32:27Z

pandas/core/internals/managers.py

@@ -411,7 +411,10 @@ def apply(self, f: str, filter=None, **kwargs):
                    axis = obj._info_axis_number
                    kwargs[k] = obj.reindex(b_items, axis=axis, copy=align_copy)

-            applied = getattr(b, f)(**kwargs)
+            if callable(f):


is this strictly necessary? meaning happy to require only callables here (would require some changing)

all of our existing usages pass strings here to get at Block methods. i think @WillAyd had a suggestion about re-working Block.apply to do str vs callable handling there; that should be its own PR

k, yeah this whole section could use some TLC
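The str-vs-callable question in this thread can be sketched with a toy model. `ToyBlock` and `manager_apply` are hypothetical stand-ins, not pandas classes; the sketch only shows the dispatch pattern being discussed: a string names a method on the block, while a callable is applied to the block's values.

```python
class ToyBlock:
    """Hypothetical stand-in for a Block: just holds a list of values."""

    def __init__(self, values):
        self.values = list(values)

    def negate(self):
        # Example of a named method reached through a string.
        return ToyBlock(-v for v in self.values)

    def apply(self, func, **kwargs):
        # Apply an arbitrary callable to each stored value.
        return ToyBlock(func(v, **kwargs) for v in self.values)


def manager_apply(blocks, f, **kwargs):
    """Apply f to every block; f is either a method name or a callable."""
    result = []
    for b in blocks:
        if callable(f):
            applied = b.apply(f, **kwargs)
        else:
            applied = getattr(b, f)(**kwargs)
        result.append(applied)
    return result


def add(value, right):
    return value + right


blocks = [ToyBlock([1, 2]), ToyBlock([3])]
by_name = manager_apply(blocks, "negate")
by_callable = manager_apply(blocks, add, right=10)
```

Moving the `callable(f)` check into the block's own `apply`, as suggested above, would leave the manager loop with a single code path.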

jreback · 2019-12-24T14:32:42Z

pandas/core/ops/array_ops.py

@@ -367,3 +368,13 @@ def fill_bool(x, left=None):
        res_values = filler(res_values)  # type: ignore

    return res_values
+
+
+def get_array_op(op, str_rep=None):


can you add a doc-string / what this is doing
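To give a sense of what a helper like this might do, here is a hedged sketch: take a binary operator and return a function of `(left, right)` that can be handed to an apply-style loop. The name `make_array_op` and the two private helpers are hypothetical; the body of the real `get_array_op` is not shown in this thread.

```python
import operator
from functools import partial

import numpy as np

COMPARISONS = {
    operator.eq, operator.ne, operator.lt,
    operator.le, operator.gt, operator.ge,
}


def _comparison(left, right, op):
    # Comparisons always produce a boolean array.
    return np.asarray(op(left, right), dtype=bool)


def _arithmetic(left, right, op):
    # Silence NumPy floating-point warnings (e.g. divide-by-zero) for this sketch.
    with np.errstate(all="ignore"):
        return op(left, right)


def make_array_op(op):
    """Return a function f(left, right) that applies `op` to an ndarray."""
    if op in COMPARISONS:
        return partial(_comparison, op=op)
    return partial(_arithmetic, op=op)


values = np.array([[1, 2], [3, 4]])
add_op = make_array_op(operator.add)
lt_op = make_array_op(operator.lt)
added = add_op(values, right=1)
compared = lt_op(values, right=3)
```

Binding `op` once with `partial` means the caller only has to supply the array and the right-hand operand.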


jreback · 2019-12-27T16:25:07Z

pandas/core/internals/blocks.py

+                block = self.make_block(values=nv, placement=[loc])
+                nbs.append(block)
+            return nbs
+


could be an elif here and re-assign to result, just to make the flow more natural. Alternatively, could make this into a method on BlockManager. But that is for follow-ups.

jreback · 2019-12-27T16:26:36Z

pandas/core/ops/__init__.py

+        array_op = get_array_op(func, str_rep=str_rep)
+        bm = left._data.apply(array_op, right=right)
+        return type(left)(bm)
+


this could just be an if (as you are returning), e.g. change the following elif to an if, but NBD

jreback · 2019-12-27T16:27:23Z

ok a couple of minor comments, rebase and looks ok to go. I suspect you will be refactoring things after this is in anyhow.


jbrockmendel · 2019-12-27T18:49:53Z

rebased+green

jreback · 2019-12-27T19:28:57Z

thanks @jbrockmendel

jbrockmendel · 2019-12-27T20:33:57Z

Alright! This was a tough slog, thanks to all who helped along the way. Next up: dispatching for op(frame, series).


huitrouge · 2020-03-18T07:41:37Z

Next up: dispatching for op(frame, series).

Hi @jbrockmendel! is there already an issue regarding this where we could track the progress?

jbrockmendel · 2020-03-18T15:36:29Z

is there already an issue regarding this where we could track the progress?

no, but i can tell you the answer.

The four cases are: scalar, series(axis=0), series(axis=1), and frame. This PR handled the scalar case. Another PR handled the series(axis=1) case (except when that series is EA-backed). #32779 handles the frame case.
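For orientation, the four operand kinds can be written with the public API as below. This is only a sketch of the user-facing calls; mapping the `axis` argument of `DataFrame.add` onto the internal case labels used in the comment above is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]}, index=["x", "y"])
col_like = pd.Series({"x": 100.0, "y": 200.0})  # labels match df.index
row_like = pd.Series({"a": 10.0, "b": 20.0})    # labels match df.columns

scalar_result = df + 1                   # scalar operand
axis0_result = df.add(col_like, axis=0)  # Series aligned against the index
axis1_result = df + row_like             # Series aligned against the columns
frame_result = df + df                   # DataFrame operand
```

The plain `df + series` operator aligns the Series against the columns, which is the same as calling `df.add(series, axis=1)`.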

huitrouge · 2020-03-19T08:40:23Z

Thank you for your reply!

Another PR handled the series(axis=1) case (except when that series is EA-backed).

by "handled" you mean, that this has already been resolved? If so in which pandas-Version?

We are still seeing performance issues in "df - series" - cases.

e.g.

import pandas as pd

df = pd.DataFrame(index=['A'], columns=range(1000), data=1.0)
s = pd.Series(index=df.columns, data=1.0)

%timeit x = df - s

is much slower on 0.24.x all the way through to the PyPI-available 1.0.3 than it was on 0.23.4

regards,
Malte
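Outside IPython, the same measurement can be reproduced with the standard-library `timeit` module. The sketch below also times a NumPy-based alternative for comparison; that alternative is an illustrative workaround, not something proposed in this thread, and it assumes the Series index matches the frame's columns exactly and in the same order. `DataFrame.to_numpy` requires pandas 0.24 or later.

```python
import timeit

import pandas as pd

df = pd.DataFrame(index=["A"], columns=range(1000), data=1.0)
s = pd.Series(index=df.columns, data=1.0)


def pandas_sub():
    # The operation from the report above.
    return df - s


def numpy_sub():
    # Assumes s.index equals df.columns in the same order (no alignment done).
    return pd.DataFrame(
        df.to_numpy() - s.to_numpy(), index=df.index, columns=df.columns
    )


for fn in (pandas_sub, numpy_sub):
    seconds = timeit.timeit(fn, number=10) / 10
    print(f"{fn.__name__}: {seconds * 1e3:.3f} ms per call")
```

Timings depend heavily on the pandas version and machine, so no expected numbers are given here.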

jbrockmendel · 2020-03-19T15:51:08Z

by "handled" you mean, that this has already been resolved? If so in which pandas-Version?

Yes, handled as in resolved. Not sure off the top of my head when that was. Before my caffeine I'd guess 60/40 that it made it into 1.0

We are still seeing performance issues in "df - series" - cases.

That case hasn't been addressed yet, will be next up after #32779. If you'd like to make a PR and improve it before I do, go for it.

backbord · 2020-03-28T09:25:00Z

@jbrockmendel, here is a performance comparison of the different pandas versions using @huitrouge's benchmark:

version  %timeit output
0.23.4:  241 µs ± 2.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
0.24.2:  218 ms ± 3.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
0.25.0:  217 ms ± 1.24 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.0.0:   215 ms ± 2.41 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.0.2:   218 ms ± 3.11 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.0.3:   216 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

hope this helps.
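To put the table in relative terms, the mean timings can be divided by the 0.23.4 baseline. The numbers below are copied from the table above; only the arithmetic is new.

```python
# Mean timings from the table above, converted to seconds.
baseline_0_23_4 = 241e-6
later_versions = {
    "0.24.2": 218e-3,
    "0.25.0": 217e-3,
    "1.0.0": 215e-3,
    "1.0.2": 218e-3,
    "1.0.3": 216e-3,
}

slowdown = {
    version: seconds / baseline_0_23_4
    for version, seconds in later_versions.items()
}

for version, factor in slowdown.items():
    print(f"{version}: about {factor:.0f}x slower than 0.23.4")
```

Every later version comes out at roughly 900 times the 0.23.4 time for this one-row, 1000-column case.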

jbrockmendel added 3 commits November 25, 2019 18:40

REF: implement scalar ops blockwise

15f0caa

Merge branch 'master' of https://github.com/pandas-dev/pandas into ba…

08a43f0

…ck-to-arith

Merge branch 'master' of https://github.com/pandas-dev/pandas into ba…

a765069

…ck-to-arith

jbrockmendel added the Numeric Operations (Arithmetic, Comparison, and Logical operations) and Performance (Memory or execution speed performance) labels Dec 3, 2019

jbrockmendel added 7 commits December 8, 2019 11:18

Merge branch 'master' of https://github.com/pandas-dev/pandas into ba…

c81ea13

…ck-to-arith

Merge branch 'master' of https://github.com/pandas-dev/pandas into ba…

c2f6129

…ck-to-arith

fix missing name

016ae64

revert

4536097

Merge branch 'master' of https://github.com/pandas-dev/pandas into ba…

798ce75

…ck-to-arith

Fix numexpr tests

1fc1e3e

ADD asv

657d1bb

remove commented-out

66d34c2

jbrockmendel changed the title ~~WIP: implement scalar ops blockwise~~ PERF: implement scalar ops blockwise Dec 21, 2019

Whatsnew

0f26775

jbrockmendel mentioned this pull request Dec 21, 2019

0.24.0 vs 0.23.4: scalar + DataFrame is 3000x slower #24990

Closed

jbrockmendel added 3 commits December 20, 2019 18:11

blackify

a0e4adc

isort fixup

23d5c48

remoe asv params that fail in ci

2228f5e

jreback reviewed Dec 24, 2019


jbrockmendel added 3 commits December 24, 2019 09:19

Merge branch 'master' of https://github.com/pandas-dev/pandas into ba…

e230cea

…ck-to-arith

comment+docstring

2f80502

Merge branch 'master' of https://github.com/pandas-dev/pandas into ba…

31607c0

…ck-to-arith

jreback reviewed Dec 27, 2019


jreback added this to the 1.0 milestone Dec 27, 2019

Merge branch 'master' of https://github.com/pandas-dev/pandas into ba…

0ec7e74

…ck-to-arith

remove unreacahble

cf94d13

jreback merged commit 23a4a51 into pandas-dev:master Dec 27, 2019

jbrockmendel deleted the back-to-arith branch December 27, 2019 20:34

AlexKirko pushed a commit to AlexKirko/pandas that referenced this pull request Dec 29, 2019

PERF: implement scalar ops blockwise (pandas-dev#29853)

e98f9b7
