non-NDFFrame object error using pandas.SparseSeries.from_coo() function #10818

Closed
francescoferroni opened this issue Aug 14, 2015 · 5 comments
Labels
Bug Sparse Sparse Data Type

Comments

@francescoferroni

There appears to be an issue with the .from_coo() sparse function. When the resulting series is displayed, it raises "TypeError: cannot concatenate a non-NDFrame object". It could potentially be due to overlapping entries on the same index (sparse.coo_matrix handles these by summing the entries by default).

More details here:
http://stackoverflow.com/questions/31970070/non-ndfframe-object-error-using-pandas-sparseseries-from-coo-function

Example:

import numpy as np
import pandas as pd
import scipy.sparse as ss

# 100 random (row, col) coordinates in a 100x100 grid; pairs may repeat
row = (np.random.random(100) * 100).astype(int)
col = (np.random.random(100) * 100).astype(int)
val = np.random.random(100) * 100
sparse = ss.coo_matrix((val, (row, col)), shape=(100, 100))
pd.SparseSeries.from_coo(sparse)  # displaying the result raises:
TypeError: cannot concatenate a non-NDFrame object
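
One way to test the overlapping-entries hypothesis is to count how often each (row, col) pair occurs before building the matrix. The sketch below is not from the original report: `duplicate_coords` is a hypothetical helper, and it uses the standard library's `random` module in place of `np.random` so it runs without numpy.

```python
import random
from collections import Counter


def duplicate_coords(rows, cols):
    """Return {(row, col): count} for every coordinate pair seen more than once.

    Hypothetical helper for illustration; not part of pandas or scipy.
    """
    counts = Counter(zip(rows, cols))
    return {coord: n for coord, n in counts.items() if n > 1}


# Same shape of input as the report: 100 random coordinates in a 100x100 grid.
rng = random.Random(0)
rows = [rng.randrange(100) for _ in range(100)]
cols = [rng.randrange(100) for _ in range(100)]

dupes = duplicate_coords(rows, cols)
print(len(dupes), "coordinate pairs occur more than once")
```

If the error also appears when `dupes` is empty, overlapping entries cannot be the whole explanation.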
@jreback jreback added Sparse Sparse Data Type Bug labels Aug 15, 2015
@jreback
Contributor

jreback commented Aug 15, 2015

Looks like a bug. Needs someone to dig into these sparse issues.

@hpaulj

hpaulj commented Dec 10, 2015

I've deduced that it's a size issue. The error appears once the underlying Series is long enough to be displayed with an ellipsis, so it depends on the number of terms (nonzero elements in the scipy sparse matrix). I did my testing in Py3 and got a different error (from the .__unicode__ branch rather than the .__bytes__ one).

It may affect any SparseSeries, regardless of whether it is constructed with from_coo, since it occurs after the to_sparse step.

s = Series(A.data, MultiIndex.from_arrays((A.row, A.col)))
s = s.sort_index()
s = s.to_sparse()  # TODO: specify kind?
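
For readers on a current pandas, the three steps above can be approximated as follows. This is a sketch under assumptions, not the code path from the pandas version discussed here: SparseSeries and to_sparse() were removed in pandas 1.0, so `astype(pd.SparseDtype(...))` stands in for `to_sparse()`, and plain numpy arrays stand in for `A.data`, `A.row` and `A.col`. The `display.max_rows` setting forces the truncated (ellipsis) display that triggered the error.

```python
import numpy as np
import pandas as pd

# Stand-ins for A.row, A.col and A.data of a scipy COO matrix.
rng = np.random.RandomState(0)
row = rng.randint(0, 100, size=100)
col = rng.randint(0, 100, size=100)
val = rng.random_sample(100) * 100

s = pd.Series(val, index=pd.MultiIndex.from_arrays((row, col)))
s = s.sort_index()
# Assumption: modern replacement for the removed to_sparse() call.
s = s.astype(pd.SparseDtype("float64"))

# Force a truncated repr, the situation in which the error was reported.
with pd.option_context("display.max_rows", 10):
    text = repr(s)

# Slicing should return a Series that still carries a sparse dtype.
head = s.iloc[0:3]
print(type(head).__name__, head.dtype)
```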

@jreback jreback added this to the Next Major Release milestone Dec 10, 2015
@jreback
Contributor

jreback commented Dec 10, 2015

The fundamental issue is that slicing is broken in sparse.

In [11]: s = Series([1]+[np.nan]*5).to_sparse()

In [12]: s
Out[12]: 
0     1
1   NaN
2   NaN
3   NaN
4   NaN
5   NaN
dtype: float64
BlockIndex
Block locations: array([0], dtype=int32)
Block lengths: array([1], dtype=int32)

In [13]: s.iloc[0:3]
Out[13]: 
[1.0, nan, nan]
Fill: nan
BlockIndex
Block locations: array([0], dtype=int32)
Block lengths: array([1], dtype=int32)

In [14]: type(s.iloc[0:3])
Out[14]: pandas.sparse.array.SparseArray

In [15]: type(s)
Out[15]: pandas.sparse.series.SparseSeries

[14] should be a SparseSeries (not a SparseArray, which is the underlying object the SparseSeries holds). It's not getting wrapped somewhere when sliced.
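
The missing-wrap pattern can be shown with toy classes. This is a pure-Python sketch of the general bug shape, not pandas code: `ToyArray`, `BuggySeries` and `FixedSeries` are hypothetical names, and the actual fix in pandas may look different.

```python
class ToyArray:
    """Hypothetical stand-in for the underlying array a series holds."""

    def __init__(self, values):
        self.values = list(values)

    def __getitem__(self, key):
        if isinstance(key, slice):
            return ToyArray(self.values[key])
        return self.values[key]


class BuggySeries:
    """Forwards slicing to the array and returns the result unwrapped."""

    def __init__(self, array, index):
        self.array = array
        self.index = list(index)

    def __getitem__(self, key):
        return self.array[key]  # a slice leaks out as a bare ToyArray


class FixedSeries(BuggySeries):
    """Re-wraps sliced results so callers get the series type back."""

    def __getitem__(self, key):
        result = self.array[key]
        if isinstance(key, slice):
            return type(self)(result, self.index[key])
        return result


values = [1.0] + [float("nan")] * 5
buggy = BuggySeries(ToyArray(values), range(6))
fixed = FixedSeries(ToyArray(values), range(6))

print(type(buggy[0:3]).__name__)
print(type(fixed[0:3]).__name__)
```

Any code that expects the series type back from a slice, such as a display routine that concatenates the head and tail of a long series, fails with the first class and works with the second.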

@jreback
Contributor

jreback commented Dec 10, 2015

this would be fixed by #10627

@kawochen kawochen mentioned this issue Dec 10, 2015
18 tasks
@sinhrks
Member

sinhrks commented Apr 3, 2016

As @jreback said, this is a display issue with a SparseSeries created by from_coo; the object itself has the correct type. Dupe of #10560.

s = pd.SparseSeries.from_coo(sparse)
type(s)
# pandas.sparse.series.SparseSeries

@sinhrks sinhrks closed this as completed Apr 3, 2016