Idea: use df.index/df.columns names to automatically choose axis along which to broadcast #13243

supern8ent · 2016-05-20T21:53:55Z

In writing some math code in pandas, I find it necessary to do things like

df2 = df.sub(ser, axis='columns')

instead of the shorter and more intuitive

df2 = df - ser

in order to control the axis along which the series is broadcast.

I think it would be a big improvement syntactically if pandas would automatically broadcast down the axis that didn't have a matching name.

Example:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(3, 2), columns=['a', 'b'])
df.index.name = 'dim0'
df.columns.name = 'dim1'
df
dim1         a         b
dim0                    
0     0.755744  0.321682
1     0.915464  0.413154
2     0.647672  0.457927

subtract:

df - df['a']    # does not give the desired result
       a   b   0   1   2
dim0                    
0    NaN NaN NaN NaN NaN
1    NaN NaN NaN NaN NaN
2    NaN NaN NaN NaN NaN

subtract, specifying which axis to match on (broadcasting happens on the other axis):

df.sub(df['a'], axis='index')   # gives the desired result
dim1    a         b
dim0               
0     0.0 -0.434062
1     0.0 -0.502310
2     0.0 -0.189745

I am suggesting that the "-" operator would look at the names of the indices in the operands and match on the axis that has the same name in the two operands.
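A minimal sketch of the proposed matching rule, written as an ordinary function rather than as a change to the "-" operator. The helper name `sub_by_name` is hypothetical and not part of pandas; it only uses the existing `DataFrame.sub` with an explicit `axis`, and it assumes the Series index carries a name that matches exactly one axis of the frame.

```python
import numpy as np
import pandas as pd


def sub_by_name(df, ser):
    """Hypothetical helper (not a pandas API): subtract `ser` from `df`,
    matching on whichever axis of `df` has the same name as `ser.index`."""
    name = ser.index.name
    if name is None:
        raise ValueError("the Series index must be named")
    matches_index = df.index.name == name
    matches_columns = df.columns.name == name
    if matches_index == matches_columns:
        # Either both axes or neither axis carries that name.
        raise ValueError(f"cannot pick a unique axis named {name!r}")
    # Match on the named axis; broadcasting happens along the other one.
    return df.sub(ser, axis="index" if matches_index else "columns")


# Deterministic data so the results are easy to check by hand.
df = pd.DataFrame(np.arange(6.0).reshape(3, 2), columns=["a", "b"])
df.index.name = "dim0"
df.columns.name = "dim1"

by_column = sub_by_name(df, df["a"])  # df["a"] is indexed by "dim0"
by_row = sub_by_name(df, df.loc[0])   # df.loc[0] is indexed by "dim1"
```

Raising an error when the name matches both axes or neither is one possible design choice; falling back to the current alignment behaviour would be another.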

By way of motivation, I'm doing mass spectral matching of compounds, so I could name my indices 'chemical' and 'mass'.

max-sixty · 2016-05-20T22:21:32Z

The concept resonates. It's exactly how xarray works - check that out if you want labelled dimensions fully supported.
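For comparison, a rough sketch of the same subtraction in xarray, assuming xarray is installed. The dimension names `dim0` and `dim1` mirror the example above, and the data are made up for illustration. Because each dimension is named, the plain "-" operator matches on `dim0` and broadcasts along `dim1` without an `axis` argument.

```python
import numpy as np
import xarray as xr

# Illustrative data with named dimensions, mirroring the DataFrame example.
da = xr.DataArray(
    np.arange(6.0).reshape(3, 2),
    dims=("dim0", "dim1"),
    coords={"dim0": [0, 1, 2], "dim1": ["a", "b"]},
)

# Select column "a"; drop=True discards the leftover scalar "dim1" coordinate.
col = da.sel(dim1="a", drop=True)

# Operands are matched on "dim0" by name and broadcast along "dim1".
result = da - col
```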

For this to work in pandas, my POV is named indexes / dimensions need to be supported throughout operations. If we only implemented it for this case, it could feel a bit too magical.

ref: #11373 & #4036

jreback · 2016-05-24T20:06:27Z

xref to #8162.

So I'm going to repurpose that as a master issue, instead of leaving open all of these issues that we will never find.

sinhrks added the Indexing and API Design labels May 21, 2016

jreback closed this as completed May 24, 2016

makmanalp mentioned this issue May 24, 2016

Allowing the index to be referenced by name, like a column #8162 (Closed)
