BUG: Sparse master issue #10627

Closed
11 of 18 tasks
kawochen opened this issue Jul 19, 2015 · 9 comments · Fixed by #28425
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves Sparse Sparse Data Type

Comments

@kawochen
Contributor

kawochen commented Jul 19, 2015

These other issues will be fixed by this

The fundamental issue is that slicing is broken in sparse.

In [11]: s = Series([1]+[np.nan]*5).to_sparse()

In [12]: s
Out[12]: 
0     1
1   NaN
2   NaN
3   NaN
4   NaN
5   NaN
dtype: float64
BlockIndex
Block locations: array([0], dtype=int32)
Block lengths: array([1], dtype=int32)

In [13]: s.iloc[0:3]
Out[13]: 
[1.0, nan, nan]
Fill: nan
BlockIndex
Block locations: array([0], dtype=int32)
Block lengths: array([1], dtype=int32)

In [14]: type(s.iloc[0:3])
Out[14]: pandas.sparse.array.SparseArray

In [15]: type(s)
Out[15]: pandas.sparse.series.SparseSeries
[14] should be a SparseSeries (not a SparseArray, which is the underlying object the SparseSeries holds). It's not getting wrapped somewhere when sliced.
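The expected behaviour can be illustrated with a toy model. This is not pandas code: `ToySparseArray` and `ToySparseSeries` are hypothetical names made up for this sketch, and it only assumes that the series delegates indexing to an underlying array. The container has to re-wrap whatever the array returns from a slice, otherwise the bare array leaks out:

```python
class ToySparseArray:
    """Hypothetical stand-in for the underlying sparse array."""

    def __init__(self, values):
        self.values = list(values)

    def __getitem__(self, key):
        if isinstance(key, slice):
            return ToySparseArray(self.values[key])
        return self.values[key]


class ToySparseSeries:
    """Hypothetical stand-in for the series that holds the array."""

    def __init__(self, values):
        # Accept either raw values or an already-built array.
        if isinstance(values, ToySparseArray):
            self.array = values
        else:
            self.array = ToySparseArray(values)

    def getitem_buggy(self, key):
        # Forwards to the array and returns its result unchanged,
        # so a slice comes back as a bare ToySparseArray.
        return self.array[key]

    def __getitem__(self, key):
        result = self.array[key]
        if isinstance(key, slice):
            # Re-wrap so that slicing preserves the container type.
            return ToySparseSeries(result)
        return result


s = ToySparseSeries([1.0] + [float("nan")] * 5)
print(type(s.getitem_buggy(slice(0, 3))).__name__)  # the leaked array type
print(type(s[0:3]).__name__)                        # the re-wrapped series type
```

Where exactly the real wrapping step is skipped is not established here; the sketch only shows the invariant that a slice of a series should itself be a series.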
@jreback
Contributor

jreback commented Jul 19, 2015

not sure why this is an issue
can u show an example?

@kawochen
Contributor Author

For Series we get back the same type:

>>> type(Series([1, 2, 3]).iloc[1:2])
<class 'pandas.core.series.Series'>
>>> type(Series([1, 2, 3]).ix[1:2])
<class 'pandas.core.series.Series'>
>>> concat([Series([1, 2, 3]).iloc[0:1], Series([1, 2, 3]).iloc[2:3]])
0    1
2    3
dtype: int64
>>> concat([SparseSeries([1, 2, 3]).iloc[0:1], SparseSeries([1, 2, 3]).iloc[2:3]])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kawoc/pandas-kawochen/pandas/tools/merge.py", line 755, in concat
    copy=copy)
  File "/home/kawoc/pandas-kawochen/pandas/tools/merge.py", line 806, in __init__
    raise TypeError("cannot concatenate a non-NDFrame object")
TypeError: cannot concatenate a non-NDFrame object
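The traceback is consistent with the slice returning the wrong type: the error is raised by a type check on the inputs. A rough sketch of that kind of guard follows, using hypothetical stand-ins (`ToyFrameBase`, `ToyArray`, `toy_concat`) rather than the real pandas internals; only the error message is taken from the traceback above.

```python
class ToyFrameBase:
    """Hypothetical stand-in for the NDFrame base class."""

    def __init__(self, values):
        self.values = list(values)


class ToyArray:
    """Hypothetical stand-in for a bare array that is not a frame."""

    def __init__(self, values):
        self.values = list(values)


def toy_concat(objs):
    # Reject anything that is not frame-like before doing any work,
    # using the same message as the traceback.
    for obj in objs:
        if not isinstance(obj, ToyFrameBase):
            raise TypeError("cannot concatenate a non-NDFrame object")
    out = []
    for obj in objs:
        out.extend(obj.values)
    return ToyFrameBase(out)


ok = toy_concat([ToyFrameBase([1]), ToyFrameBase([3])])
print(ok.values)

try:
    toy_concat([ToyArray([1]), ToyArray([3])])
except TypeError as exc:
    print(exc)
```

Under this reading, a slice that hands back the underlying array instead of a series fails the check even though the data itself is fine.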

@jreback
Contributor

jreback commented Jul 19, 2015

well are the same issue then

@kawochen
Contributor Author

hmm I suppose so, since that was the issue I was looking at. I checked the history and the difference between SparseSeries's fastpath and Series's fastpath seems deliberate (and tests do fail if I just make them the same), so I just wanted to point out SparseSeries hits paths that are not covered in tests, and we see things like SparseSeries([1,2,3]).iloc[0:3] and SparseSeries([1,2,3]).iloc[:] being different.
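One way to catch this kind of divergence in tests is to index the same object with several equivalent slice spellings and compare the result types. The helper below is a minimal sketch; `slice_result_types` and `ToyInconsistent` are hypothetical names, and which slice takes which path in the toy class is arbitrary (the thread does not say which of the two results is the wrong one).

```python
def slice_result_types(obj, keys):
    """Return the set of type names produced by indexing obj with each key."""
    return {type(obj[key]).__name__ for key in keys}


class ToyInconsistent:
    """Hypothetical container whose full slice takes a different code path."""

    def __init__(self, values):
        self.values = list(values)

    def __getitem__(self, key):
        if key == slice(None):
            # Full slice: returns the container type.
            return ToyInconsistent(self.values)
        # Any other key: leaks the raw list (or a scalar).
        return self.values[key]


# Two spellings that select the same three elements.
keys = [slice(0, 3), slice(None)]

# A well-behaved container yields exactly one result type.
print(slice_result_types([1, 2, 3], keys))

# The toy container yields two, which a test can flag.
print(sorted(slice_result_types(ToyInconsistent([1, 2, 3]), keys)))
```

A test built on this would assert that the returned set has exactly one element.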

@kawochen
Contributor Author

#10079 is also the same issue (construction doesn't error, .__repr__ does, but not for smaller frames)

@sinhrks added the Bug and Sparse (Sparse Data Type) labels Jul 20, 2015
@jreback added the Indexing (Related to indexing on series/frames, not to indexes themselves), Master Tracker (High level tracker for similar issues), and Difficulty Intermediate labels Dec 10, 2015
@jreback changed the title from "BUG: SparseSeries fastpath constructor buggy" to "BUG: Sparse master issue" Dec 10, 2015
@jreback added this to the 0.18.0 milestone Dec 10, 2015
@jreback modified the milestones: Next Major Release, 0.18.0 Jan 24, 2016
@jreback
Contributor

jreback commented Apr 4, 2016

@gfyoung whole host of issues w.r.t. sparse, note @sinhrks has recently fixed some of these

@gfyoung
Member

gfyoung commented Apr 4, 2016

@jreback : Maybe I'll join the party here once I "finish" with fromnumeric.py. 😄 Btw, I think scipy.sparse might be a useful resource in terms of getting ideas about functions, as they do a lot of their implementation in pure Python as well.

@jreback
Contributor

jreback commented Apr 4, 2016

I don't think I made an issue for this, but we are open to taking sparse/array.py (and limited infrastructure) and making a pandas-sparse package (that pandas would then depend on). This would simplify the interface/API.

@gfyoung
Member

gfyoung commented Dec 5, 2016

#667, #12794, #13001, and #13110 have all been closed for some time now. Should all be checked off.

@jreback modified the milestones: Next Major Release, High Level Issue Tracking Sep 24, 2017
@TomAugspurger removed the Master Tracker (High level tracker for similar issues) label Jul 6, 2018
@TomAugspurger removed this from the High Level Issue Tracking milestone Jul 6, 2018
5 participants