1-D lists and 1-D np.arrays are treated differently, and how they differ depends on the shape:
>>> df == [1, 2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/ops.py", line 1815, in f
    try_cast=False)
  File "pandas/core/frame.py", line 4848, in _combine_const
    try_cast=try_cast)
  File "pandas/core/internals/managers.py", line 529, in eval
    return self.apply('eval', **kwargs)
  File "pandas/core/internals/managers.py", line 423, in apply
    applied = getattr(b, f)(**kwargs)
  File "pandas/core/internals/blocks.py", line 1437, in eval
    'block values'.format(other=other))
ValueError: Invalid broadcasting comparison [[1, 2]] with block values
>>> df == np.array([1, 2])
       A      B
0  False  False
1  False  False
2  False  False
>>> df == [1, 2, 3]
       A      B
0  False   True
1   True  False
2  False  False
>>> df == np.array([1, 2, 3])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/ops.py", line 1815, in f
    try_cast=False)
  File "pandas/core/frame.py", line 4848, in _combine_const
    try_cast=try_cast)
  File "pandas/core/internals/managers.py", line 529, in eval
    return self.apply('eval', **kwargs)
  File "pandas/core/internals/managers.py", line 423, in apply
    applied = getattr(b, f)(**kwargs)
  File "pandas/core/internals/blocks.py", line 1437, in eval
    'block values'.format(other=other))
ValueError: Invalid broadcasting comparison [array([1, 2, 3])] with block values
I think the non-raising behavior is correct in both cases. When operating against a list/np.array/Index/EA, if exactly one axis has the matching length, we should broadcast against the other axis.
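To make the proposed rule concrete, here is a minimal NumPy-only sketch. `broadcast_1d` is a hypothetical helper, not a pandas function, and `values` is a stand-in for `df.values`, inferred from the outputs shown in this issue. Raising when both axes match (a square frame) is my assumption; the proposal above only covers the case where a unique axis matches.

```python
import numpy as np


def broadcast_1d(values, other):
    """Hypothetical helper: expand a 1-D `other` to the shape of 2-D `values`.

    If only the column count matches, `other` is repeated down the rows.
    If only the row count matches, `other` is repeated across the columns.
    """
    values = np.asarray(values)
    other = np.asarray(other)
    if values.ndim != 2 or other.ndim != 1:
        raise ValueError("expected 2-D values and a 1-D other")

    nrows, ncols = values.shape
    matches_rows = len(other) == nrows
    matches_cols = len(other) == ncols

    if matches_rows and matches_cols:
        # Assumption: treat the square case as ambiguous rather than guess.
        raise ValueError("ambiguous: length matches both axes")
    if matches_cols:
        return np.broadcast_to(other, values.shape)
    if matches_rows:
        return np.broadcast_to(other[:, None], values.shape)
    raise ValueError("length matches neither axis")


# Stand-in for df.values, inferred from the outputs in this issue.
values = np.arange(6).reshape(3, 2)

eq_len2 = values == broadcast_1d(values, [1, 2])     # matches the 2 columns
eq_len3 = values == broadcast_1d(values, [1, 2, 3])  # matches the 3 rows
```

Under this rule, lists and arrays of the same length give the same answer, and both lengths 2 and 3 succeed for a (3, 2) frame.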
Operating against a Series that shares df.index is counter-intuitive:
>>> df + df['A']
    0   1   2   A   B
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN
Since df['A'].index matches df.index, I think it makes much more sense for this to add df['A'] to each column and return (@shoyer IIRC you've suggested this before):
   A  B
0  0  1
1  4  5
2  8  9
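For comparison, the index-aligned result can already be requested explicitly with `DataFrame.add` and `axis=0`. The sketch below assumes a frame reconstructed from the outputs in this issue (the original definition of `df` is not shown), so treat the construction as a guess.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of df, inferred from the outputs in this issue.
df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["A", "B"])

# axis=0 aligns the Series on the frame's index and adds it to every column.
result = df.add(df["A"], axis=0)
print(result)
```

This is the explicit spelling; the suggestion above is about what the bare `+` operator should do when the indexes match.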
Operations against 2-D np.arrays do not follow NumPy broadcasting rules (I expected they would):
>>> df + df[['A']].values
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/ops.py", line 1734, in f
    other = _align_method_FRAME(self, other, axis)
  File "pandas/core/ops.py", line 1690, in _align_method_FRAME
    given_shape=right.shape))
ValueError: Unable to coerce to DataFrame, shape must be (3, 2): given (3, 1)
>>> df.values + df[['A']].values
array([[0, 1],
       [4, 5],
       [8, 9]])
>>> df + df.iloc[:1].values
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/ops.py", line 1734, in f
    other = _align_method_FRAME(self, other, axis)
  File "pandas/core/ops.py", line 1690, in _align_method_FRAME
    given_shape=right.shape))
ValueError: Unable to coerce to DataFrame, shape must be (3, 2): given (1, 2)
>>> df.values + df.iloc[:1].values
array([[0, 2],
       [2, 4],
       [4, 6]])
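A rough sketch of the behavior I expected: do the arithmetic on the underlying values with NumPy broadcasting, then re-wrap the result with the frame's labels. `add_like_numpy` is a hypothetical helper, not part of pandas, and `df` is again reconstructed from the outputs above.

```python
import numpy as np
import pandas as pd


def add_like_numpy(frame, array):
    """Hypothetical helper: add a 2-D array using NumPy broadcasting rules.

    Only works when the broadcast result keeps the frame's shape, e.g. an
    array of shape (nrows, 1) or (1, ncols).
    """
    summed = frame.values + np.asarray(array)
    return pd.DataFrame(summed, index=frame.index, columns=frame.columns)


# Assumed reconstruction of df, inferred from the outputs in this issue.
df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["A", "B"])

by_column = add_like_numpy(df, df[["A"]].values)  # shape (3, 1)
by_row = add_like_numpy(df, df.iloc[:1].values)   # shape (1, 2)
```

Both calls succeed here, whereas the bare `+` raises in the sessions above.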
At the moment some DataFrame arithmetic/comparison operations have unintuitive or ad-hoc broadcasting behavior.