SparseSeries throws a ValueError exception when printing large arrays #13144

Closed
bryanwweber opened this issue May 11, 2016 · 4 comments
Labels
Bug Output-Formatting __repr__ of pandas objects, to_string Sparse Sparse Data Type

@bryanwweber

Code Sample, a copy-pastable example if possible

import scipy.sparse as sps
import pandas as pd
A = sps.rand(350, 18)
ss = pd.SparseSeries.from_coo(A)
print(ss)

A = sps.rand(350, 17)
ss = pd.SparseSeries.from_coo(A)
print(ss)

Expected Output

Both cases should print the SparseSeries. However, only the second case (350 x 17) prints properly. The first case (350 x 18) raises the following exception:

Traceback (most recent call last):
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/sparse/series.py", line 420, in _get_values
    fastpath=True).__finalize__(self)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/sparse/series.py", line 222, in __init__
    self.index = index
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/core/generic.py", line 2685, in __setattr__
    return object.__setattr__(self, name, value)
  File "pandas/src/properties.pyx", line 65, in pandas.lib.AxisProperty.__set__ (pandas/lib.c:44748)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/core/series.py", line 274, in _set_axis
    labels = _ensure_index(labels)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/indexes/base.py", line 3409, in _ensure_index
    return Index(index_like)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/indexes/base.py", line 268, in __new__
    cls._scalar_data_error(data)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/indexes/base.py", line 483, in _scalar_data_error
    repr(data)))
TypeError: Index(...) must be called with a collection of some kind, None was passed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/core/base.py", line 46, in __str__
    return self.__unicode__()
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/sparse/series.py", line 306, in __unicode__
    series_rep = Series.__unicode__(self)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/core/series.py", line 984, in __unicode__
    max_rows=max_rows)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/core/series.py", line 1025, in to_string
    dtype=dtype, name=name, max_rows=max_rows)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/core/series.py", line 1052, in _get_repr
    max_rows=max_rows)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/formats/format.py", line 145, in __init__
    self._chk_truncate()
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/formats/format.py", line 158, in _chk_truncate
    series = concat((series.iloc[:row_num],
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/core/indexing.py", line 1296, in __getitem__
    return self._getitem_axis(key, axis=0)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/core/indexing.py", line 1587, in _getitem_axis
    return self._get_slice_axis(key, axis=axis)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/core/indexing.py", line 1579, in _get_slice_axis
    return self._slice(slice_obj, axis=axis, kind='iloc')
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/core/indexing.py", line 99, in _slice
    return self.obj._slice(obj, axis=axis, kind=kind)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/core/series.py", line 578, in _slice
    return self._get_values(slobj)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/sparse/series.py", line 422, in _get_values
    return self[indexer]
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/sparse/series.py", line 396, in __getitem__
    return self._get_val_at(self.index.get_loc(key))
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/sparse/series.py", line 392, in _get_val_at
    return self.block.values._get_val_at(loc)
  File "/Users/bryan/anaconda3/lib/python3.5/site-packages/pandas/sparse/array.py", line 308, in _get_val_at
    if loc < 0:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
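
A possible reading of why 18 columns fails while 17 succeeds, offered as a sketch rather than a confirmed diagnosis: the traceback passes through `_chk_truncate`, which slices the series only when it is longer than `display.max_rows` (60 by default). `sps.rand` uses a default density of 0.01, and `from_coo` by default indexes only the stored entries, so the two series should have roughly 63 and 59 rows. The helper names below are hypothetical, and the rounding of the entry count is an assumption that may differ between SciPy versions.

```python
import math

DEFAULT_DENSITY = 0.01   # default `density` of scipy.sparse.rand
DEFAULT_MAX_ROWS = 60    # default pandas `display.max_rows`


def approx_stored_entries(n_rows, n_cols, density=DEFAULT_DENSITY):
    """Approximate number of stored entries in sps.rand(n_rows, n_cols).

    Assumption: the count is density * n_rows * n_cols rounded down; some
    SciPy versions may round to the nearest integer instead.
    """
    return math.floor(density * n_rows * n_cols)


def repr_would_truncate(n_rows, n_cols, max_rows=DEFAULT_MAX_ROWS):
    """True if a series with one row per stored entry exceeds max_rows."""
    return approx_stored_entries(n_rows, n_cols) > max_rows


# Compare the two shapes from the report against the display limit.
for shape in [(350, 18), (350, 17)]:
    print(shape, approx_stored_entries(*shape), repr_would_truncate(*shape))
```

Under these assumptions the 350 x 18 case lands just above the 60-row limit and the 350 x 17 case just below it, which would explain why only the first one reaches the slicing code in `_chk_truncate`.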

output of pd.show_versions()

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: None
pip: 8.1.1
setuptools: 20.7.0
Cython: 0.24
numpy: 1.11.0
scipy: 0.17.0
statsmodels: None
xarray: None
IPython: 4.2.0
sphinx: 1.4.1
patsy: None
dateutil: 2.5.2
pytz: 2016.3
blosc: None
bottleneck: None
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: 0.9.2
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None
@bryanwweber
Author

In pandas 0.18.0, the same first case raises a different exception:

import scipy.sparse as sps
import pandas as pd
A = sps.rand(350, 18)
ss = pd.SparseSeries.from_coo(A)
print(ss)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/bryan/anaconda3/envs/pandas-test/lib/python3.5/site-packages/pandas/core/base.py", line 42, in __str__
    return self.__unicode__()
  File "/Users/bryan/anaconda3/envs/pandas-test/lib/python3.5/site-packages/pandas/sparse/series.py", line 287, in __unicode__
    series_rep = Series.__unicode__(self)
  File "/Users/bryan/anaconda3/envs/pandas-test/lib/python3.5/site-packages/pandas/core/series.py", line 959, in __unicode__
    max_rows=max_rows)
  File "/Users/bryan/anaconda3/envs/pandas-test/lib/python3.5/site-packages/pandas/core/series.py", line 1000, in to_string
    dtype=dtype, name=name, max_rows=max_rows)
  File "/Users/bryan/anaconda3/envs/pandas-test/lib/python3.5/site-packages/pandas/core/series.py", line 1027, in _get_repr
    max_rows=max_rows)
  File "/Users/bryan/anaconda3/envs/pandas-test/lib/python3.5/site-packages/pandas/core/format.py", line 144, in __init__
    self._chk_truncate()
  File "/Users/bryan/anaconda3/envs/pandas-test/lib/python3.5/site-packages/pandas/core/format.py", line 158, in _chk_truncate
    series.iloc[-row_num:]))
  File "/Users/bryan/anaconda3/envs/pandas-test/lib/python3.5/site-packages/pandas/tools/merge.py", line 834, in concat
    copy=copy)
  File "/Users/bryan/anaconda3/envs/pandas-test/lib/python3.5/site-packages/pandas/tools/merge.py", line 890, in __init__
    raise TypeError("cannot concatenate a non-NDFrame object")
TypeError: cannot concatenate a non-NDFrame object

pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.0
nose: None
pip: 8.1.1
setuptools: 20.7.0
Cython: None
numpy: 1.11.0
scipy: 0.17.0
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
boto: None

@jreback
Contributor

jreback commented May 11, 2016

Hmm, that seems odd. You're welcome to take a look!

@jreback jreback added Bug Output-Formatting __repr__ of pandas objects, to_string Sparse Sparse Data Type Difficulty Intermediate labels May 11, 2016
@jreback jreback added this to the 0.18.2 milestone May 11, 2016
@jreback
Contributor

jreback commented May 11, 2016

cc @sinhrks

@kawochen kawochen mentioned this issue May 11, 2016
@sinhrks
Member

sinhrks commented May 11, 2016

Yeah, currently indexing doesn't work well with MultiIndex...
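
Since both tracebacks fail while slicing the series for a truncated repr, one possible workaround is to lift the row limit while printing so that no slicing is attempted. This is only a sketch and has not been verified against pandas 0.18.x. It uses a plain `Series` with a made-up two-level index as a stand-in (an assumption, since `SparseSeries` is not available in recent pandas releases) and only illustrates how `display.max_rows` controls truncation.

```python
import pandas as pd

# Hypothetical stand-in for the 63-entry, two-level index from the report.
index = pd.MultiIndex.from_tuples(
    [(i // 18, i % 18) for i in range(63)], names=["row", "col"]
)
series = pd.Series([float(i) for i in range(63)], index=index)

# With a 60-row limit the repr is truncated, which requires slicing the series.
with pd.option_context("display.max_rows", 60):
    truncated = repr(series)

# With no limit every row is printed, so the truncation path is skipped.
with pd.option_context("display.max_rows", None):
    full = repr(series)

# The untruncated repr has more lines than the truncated one.
print(len(truncated.splitlines()), len(full.splitlines()))
```

Applied to the report, the same idea would be to wrap `print(ss)` in `pd.option_context("display.max_rows", None)`; whether that sidesteps the exception on the affected versions is an assumption.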
