Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes SparseSeries initiated with dictionary raising AttributeError #16960

Merged
merged 8 commits into from
Jul 19, 2017
Merged
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.21.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -180,7 +180,7 @@ Groupby/Resample/Rolling

Sparse
^^^^^^

- Bug in SparseSeries raises AttributeError when a dictionary is passed in as data (:issue:`16777`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add double ticks ("``") around both SparseSeries and `AttributeError`?



Reshaping
Expand Down
2 changes: 1 addition & 1 deletion pandas/core/sparse/series.py
Original file line number Diff line number Diff line change
Expand Up @@ -146,10 +146,10 @@ def __init__(self, data=None, index=None, sparse_index=None, kind='block',
data = data._data

elif isinstance(data, (Series, dict)):
data = Series(data)
if index is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I think you might be able to remove this if index is None bizeness (and simply use data.index if needed below)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see what if anything breaks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran the full test suite with this change and everything is working. I also manually ran a bunch of SparseSeries with different inputs (Series, dicts, with/without indexes) and it does work as intended.

index = data.index.view()

data = Series(data)
res = make_sparse(data, kind=kind, fill_value=fill_value)
data, sparse_index, fill_value = res

Expand Down
15 changes: 15 additions & 0 deletions pandas/tests/sparse/test_series.py
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,18 @@ def setup_method(self, method):
self.ziseries2 = SparseSeries(arr, index=index, kind='integer',
fill_value=0)

def test_constructor_data_input(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to test_constructor_dict_input and move your "see" comment to right below the definition.

# see gh-16905
test_dict = {1: 1}
test_index = [0, 1, 2]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't name things with test_

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would _test_dict, etc work? I saw that higher up in the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @jreback is saying to come up with different names that don't use "test" at all.

test_series = pd.Series({1: 1}, index=test_index)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace the {1: 1} with test_dict.


arr = SparseSeries(test_dict)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result =

assert arr.count() == len(test_dict)

arr = SparseSeries(test_series, index=test_index))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops: got an extra right parenthesis there

assert arr.count() == test_series.count()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls use assert_sp_series_equal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to use assert_series_equal since assert_sp_series_equal requires two sparse series, even if check_series_type=False.
This appears to result from line 1523 in util.testing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what you should do is construct your expected result via instantiation as a Series (i.e. pass the dict into the Series constructor). Then pass the Series object into the SparseSeries constructor. That becomes your expected output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second test is testing the creation of a SparseSeries from a Series. The SparseSeries constructor takes either a dict or a Series. Aside from class, the SparseSeries should be equal to the original Series.


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay : in light of what you just said, I think you can condense this test a bit and rename as follows:

def test_constructor_dict_input(self):
...
series = pd.Series(constructor_dict, index=index)
expected = SparseSeries(series, index=index)

result = SparseSeries(constructor_dict)
tm.assert_sp_series_equal(result, expected)

And delete everything else below it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. I added this test and did find a case that was not handled correctly. I fixed the code and added tests for all options of SparseSeries creation.

def test_constructor_dtype(self):
arr = SparseSeries([np.nan, 1, 2, np.nan])
assert arr.dtype == np.float64
Expand All @@ -109,6 +121,9 @@ def test_constructor_dtype(self):
assert arr.dtype == np.int64
assert arr.fill_value == 0

arr = SparseSeries({1: 1})
assert arr.dtype == np.int64

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second look, I don't think you need this. You can remove.

def test_iteration_and_str(self):
[x for x in self.bseries]
str(self.bseries)
Expand Down