Idea: use df.index/df.columns names to automatically choose axis along which to broadcast #13243

Closed
supern8ent opened this issue May 20, 2016 · 2 comments
Labels
API Design; Indexing (related to indexing on series/frames, not to indexes themselves)

Comments

@supern8ent

In writing some math code in pandas, I find it necessary to do things like

df2 = df.sub(ser, axis='columns')

instead of the shorter and more intuitive

df2 = df - ser

in order to control the axis along which the series is broadcast.

I think it would be a big improvement syntactically if pandas would automatically broadcast down the axis that didn't have a matching name.

Example:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(3, 2), columns=['a', 'b'])
df.index.name = 'dim0'
df.columns.name = 'dim1'
df
dim1         a         b
dim0                    
0     0.755744  0.321682
1     0.915464  0.413154
2     0.647672  0.457927

subtract:

df - df['a']    # does not give the desired result
       a   b   0   1   2
dim0                    
0    NaN NaN NaN NaN NaN
1    NaN NaN NaN NaN NaN
2    NaN NaN NaN NaN NaN

subtract, specifying which axis to match on (broadcasting happens on the other axis):

df.sub(df['a'], axis='index')   # gives the desired result
dim1    a         b
dim0               
0     0.0 -0.434062
1     0.0 -0.502310
2     0.0 -0.189745

I am suggesting that the "-" operator should look at the index names of both operands and match on the axis whose name is the same in both, broadcasting along the other axis.
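To make the proposal concrete, here is a rough sketch of the matching rule as a plain function. `sub_by_name` is a hypothetical helper written for illustration, not part of pandas, and it assumes the series index and exactly one axis of the frame carry the same name. It only wraps the existing `df.sub(ser, axis=...)` call.

```python
import numpy as np
import pandas as pd


def sub_by_name(df: pd.DataFrame, ser: pd.Series) -> pd.DataFrame:
    """Subtract ser from df, matching on the axis named like ser.index.

    Hypothetical helper (not a pandas API): it picks the axis of df whose
    name equals ser.index.name and broadcasts along the other axis.
    """
    name = ser.index.name
    if name is None:
        raise ValueError("the series index must be named")

    # Collect the axes of df whose name matches the series index name.
    matches = [
        axis
        for axis, idx in (("index", df.index), ("columns", df.columns))
        if idx.name == name
    ]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one axis named {name!r}, found {len(matches)}")

    return df.sub(ser, axis=matches[0])


rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((3, 2)), columns=["a", "b"])
df.index.name = "dim0"
df.columns.name = "dim1"

# df["a"] is indexed by dim0, so it is matched on the index.
by_row = sub_by_name(df, df["a"])

# df.loc[0] is indexed by dim1, so it is matched on the columns.
by_col = sub_by_name(df, df.loc[0])
```

With this rule, `by_row` is the same as `df.sub(df['a'], axis='index')` from the example above, and the caller never has to spell out the axis. Raising an error when no axis (or both axes) match is one possible design choice; the issue itself does not say what should happen in that case.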

By way of motivation, I'm doing mass spectral matching of compounds, so I could name my indices 'chemical' and 'mass'.

@max-sixty
Contributor

max-sixty commented May 20, 2016

The concept resonates. It's exactly how xarray works - check that out if you want labelled dimensions fully supported.

For this to work in pandas, my POV is named indexes / dimensions need to be supported throughout operations. If we only implemented it for this case, it could feel a bit too magical.

ref: #11373 & #4036

@sinhrks added the Indexing and API Design labels May 21, 2016
@jreback
Contributor

jreback commented May 24, 2016

xref to #8162.

So I'm going to repurpose that as a master issue, instead of keeping all of these issues open where we will never find them.
