to_sparse bug with fill_value specified #1375

Closed
changhiskhan opened this issue Jun 1, 2012 · 2 comments
Labels: Bug, Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)

Comments

@changhiskhan
Contributor

Originally raised on the pydata mailing list:

In [50]: DataFrame({'x': [1., 1.]}).to_sparse(fill_value=0).x.mean()
Out[50]: 0.5

In [51]: DataFrame({'x': [1., 1.]}).to_sparse().x.mean()
Out[51]: 1.0

@ghost ghost assigned changhiskhan Jun 1, 2012
@grsr

grsr commented Jun 1, 2012

I took a quick look at the Python code implementing this. In both SparseArray.mean and SparseArray.sum, the nsparse variable appears to count the number of non-sparse entries rather than the number of sparse entries, which causes the incorrect values these methods return. I think setting nsparse = self.sp_index.length - self.sp_index.npoints in both methods should fix the issue, but I don't understand the code well enough to be sure this is correct.
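To make the arithmetic concrete, here is a minimal pure-Python sketch of how a miscounted nsparse could produce the 0.5 seen above. It is not the pandas source: the function name sparse_mean, its parameters, and the buggy flag are hypothetical, and the exact form of the original miscount is an assumption based only on the description in this comment.

```python
def sparse_mean(sp_values, length, fill_value, buggy=False):
    """Mean of a sparse array (hypothetical helper, not pandas code).

    sp_values  -- the explicitly stored (non-fill) values
    length     -- total logical length of the array
    fill_value -- value assumed at every position that is not stored
    """
    npoints = len(sp_values)  # number of stored, non-sparse entries
    if buggy:
        # Assumed form of the bug: the stored entries are counted
        # as if they were the sparse (filled) ones.
        nsparse = npoints
    else:
        # Suggested fix: sparse entries are whatever is not stored.
        nsparse = length - npoints
    total = sum(sp_values) + fill_value * nsparse
    return total / (npoints + nsparse)


# The case from the report: x = [1., 1.] with fill_value=0,
# so both values are stored and nothing is filled.
buggy_result = sparse_mean([1.0, 1.0], length=2, fill_value=0.0, buggy=True)
fixed_result = sparse_mean([1.0, 1.0], length=2, fill_value=0.0)
print(buggy_result, fixed_result)
```

With the miscount, the two stored values are divided by a count of 4 instead of 2, which matches the reported 0.5; with nsparse = length - npoints the count is 2 and the mean is 1.0.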

changhiskhan pushed a commit that referenced this issue Jun 1, 2012
@changhiskhan
Contributor Author

@grsr that's pretty much what I did. Thanks for the input.
If you're looking for ways to get involved without digging too deep into the codebase, we'll soon start adding "Community" labels to issues that we think are more self-contained and require less staring at pandas internals.
