Make NA/null a first-class citizen in groupby operations #16

wesm · 2016-09-07T15:48:13Z

xref #9

Maybe we can collect a list of pandas issues related to this.

Can't iterate a DataFrameGroupBy object if the group by key contains NaN pandas-dev/pandas#14170

I've found it valuable to be able to compute statistics consistently with the NA values included, especially with multiple group keys. I haven't kept track of how pandas currently handles these in all cases, but it would be nice to come up with a strategy that makes NA behave like any other group in a groupby setting.

wesm · 2016-09-19T17:07:18Z

This problem also extends to other analytics, like value_counts:

In:
import numpy as np
import pandas as pd

s = pd.Series([1, 2, np.nan, 1, 1, 2, np.nan])
s.value_counts()

Out:
1.0    3
2.0    2
dtype: int64

Here, NA should appear in the result with a count of 2. The same goes for groupby(...).size().
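To make the groupby(...).size() case concrete, here is a minimal sketch. The DataFrame, its column names, and the "<NA>" sentinel are all made up for illustration; the sentinel trick is only a common workaround, not something proposed in this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two of the five rows have a missing group key.
df = pd.DataFrame({
    "key": ["a", "b", np.nan, "a", np.nan],
    "value": [1, 2, 3, 4, 5],
})

# With default settings, rows whose key is NaN are left out of the result,
# so only the "a" and "b" groups are counted.
sizes = df.groupby("key").size()

# Workaround sketch: replace NaN with a sentinel label before grouping.
# "<NA>" is an arbitrary placeholder and must not collide with a real key.
sizes_with_na = df.assign(key=df["key"].fillna("<NA>")).groupby("key").size()

print(sizes)
print(sizes_with_na)
```

The sentinel approach only works when a value can be chosen that never occurs in the real data, which is part of why first-class NA handling would be preferable.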

jorisvandenbossche · 2016-09-19T17:40:34Z

In the specific case of value_counts, there is the dropna keyword which does this:

In [15]: s.value_counts(dropna=False)
Out[15]: 
 1.0    3
NaN     2
 2.0    2
dtype: int64

But of course that does not address the bigger problem with groupby and others (and you could also argue that dropna=False would be a better default value).

chris-b1 · 2016-09-19T17:44:58Z

It's linked in the top issue, but just for visibility, pandas-dev/pandas#12607 is a WIP PR that would add the dropna keyword arg to groupby.
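For reference, a sketch of how such a keyword looks in use. This assumes a pandas version in which groupby accepts dropna (1.1 or later, as far as I know); it says nothing about the state of the WIP PR itself, and the data and column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values in both group keys.
df = pd.DataFrame({
    "k1": ["a", "a", np.nan, np.nan],
    "k2": [1.0, np.nan, 1.0, np.nan],
    "value": [10, 20, 30, 40],
})

# Default behaviour: any row with a NaN in one of the keys is dropped,
# so only the ("a", 1.0) group remains.
default_totals = df.groupby(["k1", "k2"])["value"].sum()

# Assumes groupby supports dropna: NaN is kept as a group label,
# so every row contributes to some group.
totals = df.groupby(["k1", "k2"], dropna=False)["value"].sum()

print(default_totals)
print(totals)
```

With multiple keys the difference is easy to miss: the default result silently covers only one of the four rows here.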

jreback added the missing data label Sep 30, 2016
