-
-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API: allow scalar setting/getting via float indexer on integer indexes #12370
Conversation
not a lot of code changes, mostly making the tests robust & w/o repeating things. haven't updated docs yet |
self.assertEqual(result4, result6) | ||
for result in [s[5.0], s[5], | ||
s.loc[5.0], s.loc[5], | ||
s.ix[5.0], s[5]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the last one here should probably be s.ix[5]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
for idxr in [lambda x: x.loc, | ||
lambda x: x.iloc, | ||
lambda x: x]: | ||
self.assertRaises(TypeError, lambda: idxr(s)[l]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am probably missing something (the diff in the tests is a bit hard to follow), but shouldn't there be a difference between loc and iloc here? As loc is doing label-based slicing, so float is allowed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are about slicing:
In [1]: s = Series(range(3))
In [2]: s[1.0:2]
TypeError: cannot do slice start value indexing on <class 'pandas.indexes.range.RangeIndex'> with these indexers [1.0] of <type 'float'>
In [3]: s.loc[1.0:2]
TypeError: cannot do slice indexing on <class 'pandas.indexes.range.RangeIndex'> with these indexers [1.0] of <type 'float'>
In [4]: s.ix[1.0:2]
Out[4]:
1 1
2 2
dtype: int64
In [5]: s.iloc[1.0:2]
TypeError: cannot do slice start value indexing on <class 'pandas.indexes.range.RangeIndex'> with these indexers [1.0] of <type 'float'>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but why would slicing be an exception?
In [107]: s.loc[1.0]
Out[107]: 1
In [108]: s.loc[2.0]
Out[108]: 2
In [109]: s.loc[1.0:2.0]
TypeError: cannot do slice indexing on <class 'pandas.indexes.numeric.Int64Index
'> with these indexers [1.0] of <type 'float'>
In s[1.0:2]
, the 1.0
is an integer location, so that should indeed raise, but in s.loc[1.0:2]
, the 1.0
is a label, and this label is found when using scalar indexing (s.loc[1.0]
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the same as your comment issue. We don't allow float indexers in slicing at ALL, except for Float64Index
. This is possible to relax, but introduces another layer of complexity here. (this doesn't apply for .ix
, just []
and .loc
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We perfectly allow floats in slices?
In [1]: s = Series(range(3))
In [2]: s.loc[1.0:2]
Out[2]:
1 1
2 2
dtype: int64
In [3]: s.ix[1.0:2]
Out[3]:
1 1
2 2
dtype: int64
In [4]: pd.__version__
Out[4]: u'0.17.1'
2f05c51
to
b194c22
Compare
ok, this is updated
net-net-net. I don't think actually much has changed from 0.17.1. (with the exception that I note when setting with We have dropped the So give it a whirl. If you find missing / unclear tests, let's fix em. The more comprehensive the better. |
Great! Quick note: I think we should also raise on I will try to take a more detailed look later this week. |
why are you calling out this a special case? This is the same path as |
I get
using your branch (so something else as you show in your last comment) |
Which now I note gives an array instead of series, which is also odd (wait a moment, maybe I have done something wrong in checkout out your branch, as this was a series when I first tested it) |
@jorisvandenbossche yeh something odd going on. will update soon. In any event I don't see a reason to make this any different it is very consistent with the other indexers. |
Very strange, it seems to depend on the values in the index:
|
@jorisvandenbossche ok, adding tests for these off integers. |
But to come back to what I originally wanted to say: raising in [].
This raises with floats when slicing in |
oh, I just generalized things. This is particular was not exactly tested :< oh |
@jreback yep, ok, in agreement it is not a special case and should stay as it was in 0.17.1 (your example). The confusion was that is actually behaved differently in the PR, which I tried to point out :-) |
@jorisvandenbossche no, my example is from 0.17.1. But I am NOT treating it as a special case, which means this WILL work the same as why should this raise? |
Because |
@jorisvandenbossche no i disagree, this is label based. Again this is introducing special cases where none should exist. |
It is for slicing (not scalars):
vs
|
ok let me adjust that |
4dae30e
to
980a97c
Compare
@jorisvandenbossche ok updated. I think that we are now consistent with 0.17.1 (and now fully tested). |
This looks good! Thanks a lot |
An edge case: if you have mixed dtype index, you get some inconsistencies:
So the second case ( |
Actually, the fact that |
hmm, I'll put in a fix for that. was not the intention to change the API (at all) |
No, I think it is a good change (although maybe we should have raised a FutureWarning for that ..). With a full string index, |
neh, this is really odd This is in 0.17.1.
These are consistent (and positional). |
And they raised a warning, so I think it is certainly OK that they now both raise (as is currently the case with this PR) |
ok, will push a fix in a minute |
So the only issue here , then, is for the mixed dtype index, where it should also raise. |
actually, I could make a case for |
I added tests, but I didnt' actually change any code. lmk what you think. |
+1 |
on something like this. (from 0.17.1)
but we are a bit more exact in that this is REALLY a |
If we had something like a StringIndex, it would be clear that a float is not possible. But for a plain Index, although it has only string values, .. Hmm, it's a bit a dubious case of course. If it was a KeyError previously, I would maybe keep it that way? (unless that makes it more complicated) |
yeah just going to keep it. pushing in a moment. |
ok all fixed up. simplified the logic a bit so its pretty clear. |
if self.is_integer() or is_index_slice: | ||
return key | ||
return self._convert_slice_indexer(key) | ||
if kind in ['getitem', 'ix'] and is_float(key): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where it aises a TypeError
here if its label based but not the right type.
closes #12333