BUG: Fix Series.get() for ExtensionArray and Categorical #20885

Merged: 9 commits, May 9, 2018
22 changes: 16 additions & 6 deletions pandas/core/indexes/base.py
@@ -3068,13 +3068,23 @@ def get_value(self, series, key):
         # if we have something that is Index-like, then
         # use this, e.g. DatetimeIndex
         s = getattr(series, '_values', None)
-        if isinstance(s, (ExtensionArray, Index)) and is_scalar(key):
-            try:
-                return s[key]
Contributor

If you just use .get_loc(key) on an Index as well, does this work? (Perf should be similar.) That way we can avoid separating the Index and ExtensionArray cases.

-            except (IndexError, ValueError):
+        if is_scalar(key):
+            if isinstance(s, Index):
+                try:
+                    return s[key]
+                except (IndexError, ValueError):

-                # invalid type as an indexer
-                pass
+                    # invalid type as an indexer
+                    pass
+            elif isinstance(s, ExtensionArray):
+                try:
+                    # This should call the ExtensionArray __getitem__
+                    iloc = self.get_loc(key)
+                    return s[iloc]
+                except (IndexError, ValueError):
+
+                    # invalid type as an indexer
+                    pass

         s = com._values_from_object(series)
         k = com._values_from_object(key)
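
The ExtensionArray branch in this hunk first translates the label into a position with get_loc and only then indexes the values. A minimal sketch of why that matters follows. It is plain Python, not pandas code: the LabelledValues class and its attributes are hypothetical stand-ins for an index paired with a positional values array, and unique labels are assumed.

```python
# Hypothetical stand-in for an index plus a positional values array.
# This is NOT pandas code; it only illustrates the two-step lookup.
class LabelledValues:
    def __init__(self, labels, values):
        if len(labels) != len(values):
            raise ValueError("labels and values must have the same length")
        self._values = list(values)
        # Map each (unique) label to its position, the role get_loc plays.
        self._positions = {label: pos for pos, label in enumerate(labels)}

    def get_loc(self, label):
        # Raises KeyError when the label is absent.
        return self._positions[label]

    def get(self, label, default=None):
        # Step 1: label -> position. Step 2: position -> value.
        try:
            return self._values[self.get_loc(label)]
        except KeyError:
            return default


lv = LabelledValues(labels=[0, 2, 4, 6], values=["a", "b", "c", "d"])
naive = lv._values[2]   # label 2 misread as position 2
looked_up = lv.get(2)   # label 2 resolved to position 1 first
missing = lv.get(100)   # absent label falls back to the default
```

With the label 2, indexing the values directly picks the element at position 2, while the two-step lookup returns the element stored under label 2.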
23 changes: 23 additions & 0 deletions pandas/tests/extension/base/getitem.py
@@ -117,6 +117,29 @@ def test_getitem_slice(self, data):
         result = data[slice(1)]  # scalar
         assert isinstance(result, type(data))

+    def test_get(self, data):
+        # GH 20882
+        s = pd.Series(data, index=[2 * i for i in range(len(data))])
+        assert s.get(4) == s.iloc[2]
+
+        result = s.get([4, 6])
+        expected = s.iloc[[2, 3]]
+        self.assert_series_equal(result, expected)
+
+        result = s.get(slice(2))
+        expected = s.iloc[[0, 1]]
+        self.assert_series_equal(result, expected)
+
+        s = pd.Series(data[:6], index=list('abcdef'))
+        assert s.get('c') == s.iloc[2]

Contributor

Could you also add a test for a slice like s.get(slice(2))? That seems to be valid as far as .get is concerned.

Contributor Author

Regarding your suggestion, I don't think you need all of those changes. (I also don't think they will work, because self._engine.get_value() returns an item from the passed ndarray and you are using that to index into the series itself.) My approach already works when a slice or multiple indices are specified. The issue arises only when the key is a single value, which is what my fix addresses.

I'll push some additional tests.

+        result = s.get(slice('b', 'd'))
+        expected = s.iloc[[1, 2, 3]]
+        self.assert_series_equal(result, expected)

Contributor

Can you add some cases with an out-of-range integer?

Contributor Author

done

+        result = s.get('Z')
+        assert result is None
+
     def test_take_sequence(self, data):
         result = pd.Series(data)[[0, 1, 3]]
         assert result.iloc[0] == data[0]
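
The scalar-key case that test_get covers can also be tried outside the test suite. The snippet below is a small illustration, assuming a pandas release that includes this fix. The Categorical values and the index are chosen here for the example and are not taken from the PR's fixtures.

```python
import pandas as pd

# Extension-array-backed values (a Categorical) with a non-default integer
# index, similar in shape to the index built in test_get.
values = pd.Categorical(["a", "b", "c", "d"])
s = pd.Series(values, index=[0, 2, 4, 6])

by_label = s.get(4)      # label 4 sits at position 2
by_position = s.iloc[2]  # the same element, addressed by position
absent = s.get("Z")      # a missing label returns the default, None
```

Here by_label and by_position refer to the same element, and the missing label yields None instead of raising.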