DataFrameGroupBy.cumcount() returning Series instead of DataFrame #5608

Closed
jorisvandenbossche opened this issue Nov 28, 2013 · 4 comments

@jorisvandenbossche
Member

@hayd

DataFrameGroupBy.cumcount() returns a Series, while most other methods (the ones I tested: count, cumsum, first, sum) return a DataFrame. They may not all be comparable, but I think cumcount could at least behave like cumsum:

In [1]: data = pd.DataFrame({'name' : ['a', 'a', 'b', 'd'], 'counts' : [3,4,3,2]})
In [2]: data
Out[2]:
   counts name
0       3    a
1       4    a
2       3    b
3       2    d

In [3]: g = data.groupby('name')
In [4]: g.cumcount()
Out[4]:
0    0
1    1
2    0
3    0
dtype: int64

In [5]: g.count()
Out[5]:
      counts  name
name
a          2     2
b          1     1
d          1     1

In [6]: g.first()
Out[6]:
      counts
name
a          3
b          3
d          2

In [7]: g.sum()
Out[7]:
      counts
name
a          7
b          3
d          2

In [8]: g.cumsum()
Out[8]:
  counts name
0      3    a
1      7   aa
2      3    b
3      2    d
@hayd
Contributor

hayd commented Nov 28, 2013

I disagree, cumcount is not "looking" at the columns.

  • It makes sense for cumsum, sum, etc. since they are summing each column.
  • Similarly for first, it's the first "row" (without the as_index)
  • count gives the count in each column (note it excludes NaN).
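
To make the distinction above concrete, here is a minimal sketch. It reuses the frame from the report and adds an `other` column containing a NaN; that column is an assumption for illustration only and is not part of the original example. `cumsum` is applied to the numeric column only, since behaviour on string columns varies between pandas versions.

```python
import numpy as np
import pandas as pd

# Frame from the report, plus a hypothetical "other" column with one NaN
# so that count() differs between columns.
data = pd.DataFrame({
    "name": ["a", "a", "b", "d"],
    "counts": [3, 4, 3, 2],
    "other": [1.0, np.nan, 2.0, 3.0],
})
g = data.groupby("name")

# cumcount numbers the rows within each group. It never reads column
# values, so there is one result per row and a Series is sufficient.
row_number = g.cumcount()

# count works column by column and skips NaN, so it returns one
# output column per (non-grouping) input column.
per_column = g.count()

# cumsum is column-wise as well; the numeric column is selected explicitly.
running = g[["counts"]].cumsum()

print(type(row_number).__name__)   # Series
print(type(per_column).__name__)   # DataFrame
print(type(running).__name__)      # DataFrame
```

In this sketch, group `a` has a count of 2 in `counts` but only 1 in `other`, which is why `count` needs a column dimension while `cumcount` does not.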

@jorisvandenbossche
Member Author

Ah, yes. Indeed, that way it is logical!
Also, g.size() returns a Series, which is similar (and because g.count() was equal for all columns in my example, I didn't think of NaNs influencing the result).
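
A small sketch of the `size` versus `count` difference mentioned here. The frame and its column names `x` and `y` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: column "x" has a NaN in group "a", column "y" does not.
df = pd.DataFrame({
    "name": ["a", "a", "b"],
    "x": [1.0, np.nan, 3.0],
    "y": [1.0, 2.0, 3.0],
})
g = df.groupby("name")

# size counts rows per group regardless of NaN: one number per group.
sizes = g.size()

# count counts non-NaN values per group and per column.
counts = g.count()

print(sizes)
print(counts)
```

When no column contains NaN, every column of `count()` equals `size()`, which is why the two looked interchangeable in the original example.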

@hayd
Contributor

hayd commented Nov 29, 2013

Thanks for reporting anyway, it made me notice some low-hanging fruit here: #5614 (I think cumsum etc. is a bit wrong)

@jorisvandenbossche
Member Author

Yes indeed, I wanted to bring that up as well, but saw that you had already done it :-)
