ENH: add .iloc attribute to provide location-based indexing #2922
Rad. I am really looking forward to this being merged in master. |
can you give a try....let me know any issues? |
also...here's a 'feature' that is included: if you specify a 'label' to .loc it will throw a ValueError. This is true EVEN IF the label actually exists and is in the requested axis ('label' means anything that is not an integer or a slice)
will ALWAYS fail, no matter the index. anyone have an issue with that? |
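A minimal sketch of the behaviour described above, assuming the comment refers to the positional indexer that the PR title calls `.iloc`. The frame and its labels are hypothetical, and the exception type has differed between pandas versions, so the sketch catches both `ValueError` and `TypeError` rather than asserting one of them.

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame({"x": [10, 20, 30]}, index=["a", "b", "c"])

# A positional lookup with an integer works.
first_row = df.iloc[0]

# A label passed to the positional indexer is rejected,
# even though "a" is a valid row label on this axis.
try:
    df.iloc["a"]
    label_rejected = False
except (TypeError, ValueError):
    # The exact exception type depends on the pandas version (assumption).
    label_rejected = True

print(first_row["x"], label_rejected)
```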
Actually, it does what I expect even with a multiIndex, what is it that you think is missing? Great stuff, I'd use this all the time. |
I hate to bikeshed but what are people's thoughts on what this should be called? Either |
do you think a purely label based is needed at all? if not i'd vote for |
I actually like iix.
|
@y-p i was missing test cases for mi |
I liked |
+1 for loc and iloc. I also think a purely label-based equivalent (eg one with strict flags) would be helpful. Of course, it just comes down to overloading the amount of slicing options present to new users. As has already been discussed on the mailing list, when labels are numerical, having a clear option for slicing by row and slicing by value, and raising errors for misuse is quite helpful. For example, imagine I have spectral data running from 400.0 - 700.0nm. If users are slicing by value, it's too easy for them to do [400:700] when they mean [400.0:700.0] or [400:700.0] and I'd prefer my programs bring this to their attention than assume their intentions. |
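The spectral-data case above could look roughly like this. It is only a sketch: the wavelength grid and the values are made up, and it uses the `.loc`/`.iloc` names that the PR description lists.

```python
import numpy as np
import pandas as pd

# Hypothetical spectrum: wavelengths in nm as a float index.
wavelengths = np.arange(400.0, 701.0, 50.0)  # 400.0, 450.0, ..., 700.0
spectrum = pd.Series(np.arange(len(wavelengths)), index=wavelengths)

# Label-based: selects by wavelength value, both endpoints included.
by_value = spectrum.loc[450.0:550.0]

# Position-based: selects by row number, end excluded.
by_row = spectrum.iloc[1:3]

print(by_value.index.tolist())
print(by_row.index.tolist())
```

With two separate indexers the caller has to say which meaning is intended, which is the point the comment makes about slicing by row versus slicing by value.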
how about |
or |
+1 for decree from wes. |
Also like suggested Edit: I meant integer. |
any consensus on the name?
so...if choose between:
|
+1 In detail: I gave this a day to think about and while I do like On that note, I think using |
Jeff, you had mentioned that you were possibly going to have 2 functions here... one that does strict by-index slicing and one that does strict by-name slicing. If so, why not call the by-index version "iloc" and the by-name version "loc"? |
I wouldn't call it If you know that you are looking for a label (and have chosen |
I wouldn't call it |
so the proposal is then:
I think can get this by essentially making and suggestions for label getter? ? |
well, it would be nice that the existing i could easily be wrong though :) |
you are welcome to have a go! but the basic idea is to have a label-based getter? (and only labels) |
well i think the idea is good i just think it'll necessarily have to be a new, third, code path (even if it's a mostly trivial one)...I'm looking at the positional/label choice logic in |
ok...the obvious question then.... should we cause an API change where by
|
well, it'd have to be a clearly documented API change then, for sure :) |
@wesm care to chime in? |
(i'm not sure I understand your example by the way...which line do you mean is not working?) |
you are right my example is wrong....what I actually mean is that integers would ONLY be for labels that match, and not have any positional meaning (float indices are another issue).... |
btw, I was trying to generate an example of something that's currently a fallback to position-based and would become disallowed and discovered this...
In [91]: df = DataFrame(np.random.randn(8, 4), index=[2, 4, 6, 8, 'null', 10, 12, 14])
In [92]: df
Out[92]:
0 1 2 3
2 -0.951922 0.502621 0.346998 -0.784631
4 1.073580 -1.030964 0.783075 0.283990
6 -0.290176 0.236777 -0.042059 -2.613214
8 0.082795 1.196050 -1.983549 2.973472
null -0.345000 -0.998171 1.035359 1.378678
10 1.762567 -0.706646 -1.591715 0.344561
12 -0.219641 -0.786794 0.228584 -0.808036
14 0.411628 0.427615 0.270707 0.160328
In [93]: df.ix[2] # <-- position-based???
Out[93]:
0 -0.290176
1 0.236777
2 -0.042059
3 -2.613214
Name: 6, Dtype: float64
In [94]: df.ix['null']
Out[94]:
0 -0.345000
1 -0.998171
2 1.035359
3 1.378678
Name: null, Dtype: float64
so I presume if you had some big csv where a string happened to pop in where it wasn't supposed to for some reason and you didn't notice it, you'd silently change the semantics of all your integer indexes... :/ (unless there's some explicit data sanitization logic somewhere to handle this...) all the more reason to provide a way to eliminate the ambiguity if possible (just not sure if breaking |
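For comparison, here is a sketch of the same mixed index queried through the two strict indexers this PR introduces. It uses deterministic data in place of `randn` so the rows are easy to identify. `.ix` itself was later removed from pandas, so it is not called here.

```python
import numpy as np
import pandas as pd

# Same index as the session above, with deterministic data instead of randn.
idx = [2, 4, 6, 8, "null", 10, 12, 14]
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=idx)

by_label = df.loc[2]      # row whose label is 2 (the first row)
by_position = df.iloc[2]  # third row, whose label is 6

print(by_label.name, by_label.tolist())
print(by_position.name, by_position.tolist())
```

Neither call depends on whether a stray string has turned the index into a mixed one, which removes the silent change of meaning described above.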
no APIs changed. No actual deprecations (just a note that we could deprecate some) |
ok....docs are updated, so pls take a look and let me know any changes |
This is just what the doctor ordered. Really great effort here jeff thanks
|
Yeah, thank you. I've just pulled in master and installed. Going to start using this immediately in some current work. Thanks for all the great work, it's really appreciated. |
Hi guys, I am subtracting a series from a dataframe and noticed that I'm getting V1, V2, V1==V2, V1-V2 954.36 954.36 True 0.0 For my intents and purposes, these should all be the same value. Is there |
PS, I should say that the raw data itself is only 2-decimal precision (eg
|
see this (and many other questions about this) I guess you could np.round or
|
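A small sketch of the rounding suggestion above, assuming the underlying issue is ordinary binary floating-point representation. The two values are illustrative and not taken from the poster's data. `np.isclose` is shown as a tolerance-based alternative and was not mentioned in the thread.

```python
import numpy as np

# Two values that look alike at 2 decimals but differ in their last bits.
v1 = np.array([0.1 + 0.2])
v2 = np.array([0.3])

# Exact comparison sees the difference.
exact_equal = bool((v1 == v2).all())

# Rounding both sides to the data's real precision removes it.
rounded_equal = bool((np.round(v1, 2) == np.round(v2, 2)).all())

# Comparing with a tolerance is another option.
close_equal = bool(np.isclose(v1, v2).all())

print(exact_equal, rounded_equal, close_equal)
```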
Thanks for this link jeff. Appreciate it.
|
Hi guys, I'm trying to slice an integer-labeled series by values. My series looks series I'd like to slice values out between 50.0 and 53.0. I've tried all the series.ix[50.0:53.0], series[50.0:53.0], series.loc[50.0:53.0], series.iloc[50.0:53.0]. I realize these methods aren't necessarily built to slice by values. Is |
try s[(s>50)&(s<53)]
|
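The boolean-mask suggestion above, spelled out as a runnable sketch. The series is hypothetical: integer labels with float values, as the question describes.

```python
import pandas as pd

# Hypothetical integer-labelled series of float values.
s = pd.Series([49.5, 50.2, 51.7, 52.9, 53.4], index=[0, 1, 2, 3, 4])

# Each comparison yields a boolean Series; & combines them element-wise.
# The parentheses are required because & binds tighter than > and <.
mask = (s > 50) & (s < 53)
selected = s[mask]

print(selected.index.tolist(), selected.tolist())
```

This filters on the values, whereas `.loc` and `.iloc` slice on labels and positions, which is why the slice attempts in the question do not select by value.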
Hmm, I guess thinking about it more, there's no reason that all the data in
|
Thanks. Sorry, got double posted on the list, so pardon my other response.
|
I know that u saw my other response answering your question. we normally allow a slice operation on the values to apply to the data that is convertible to that type. there is a way to modify this by calling the .where method directly; see the doc string
|
Alright I'll look into it, thanks
|
I know this is from a while back - but could we add a note to the docs about how you replace
You can get the functionality of |
there is an example in the indexing docs and 10min IIRC |
okay I'll try to find it and then add it to the docs there with examples when I do. |
Hello, I am using a wrapper that calls dataframe.plot(). As this only returns an http://matplotlib.org/examples/pylab_examples/colorbar_tick_labelling_demo.html Anyone have any luck with this in the past? |
@hugadams you are commenting on an older |
Hello, I'm using pandas datastructures in conjunction with other structures in a I read these into a dataframe, and then a store one column (with the same [480.23 480.6 480.96] [480.23 480.59999999999997 480.96] These are from Float64Index structures. I really have no clue if this The real problem here is that when I add or subtract dataframes, these Thanks |
If the float values are getting 'changed' then they are different and may not match very well; aligning on float indexes is probably not a good idea. You can try 0.14.0 which has a better |
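A sketch of why aligning on float indexes is fragile, using the two label sets quoted in the question. The series values are made up, and rounding both indexes to two decimals is one possible workaround, assuming the data really has 2-decimal precision. It is not something this thread prescribes.

```python
import numpy as np
import pandas as pd

a = pd.Series([1, 2, 3], index=[480.23, 480.6, 480.96])
b = pd.Series([10, 20, 30], index=[480.23, 480.59999999999997, 480.96])

# Alignment uses exact label equality, so the two middle labels do not
# match and the result gains an extra row plus two NaN entries.
misaligned = a + b

# Possible workaround: round both indexes to the data's real precision.
a.index = np.round(a.index.to_numpy(), 2)
b.index = np.round(b.index.to_numpy(), 2)
aligned = a + b

print(len(misaligned), int(misaligned.isna().sum()))
print(aligned.tolist())
```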
best to post on a new issue |
Wow, sorry. I was sending this to the mailing list and must have put an autofilled address to this thread. My apologies. |
See: #7860 |
Updated to include new indexers:
.iloc for pure integer based indexing
.loc for pure label based indexing
.iat for fast scalar access by integer location
.at for fast scalar access by label location
Much updated docs, test suite, and example
In the new test_indexing.py, you can change the _verbose flag to True to get more test output
anybody interested can investigate a couple of cases marked "no comp", which are where the new indexing behavior differs from .ix (or .ix doesn't work); this doesn't include cases where a KeyError/IndexError is raised (but .ix lets these thru)
Also, I wrote .iloc on top of .ix but most methods are overridden; it is possible that this lets something thru that should not, so pls take a look
Please try this out and let me know if any of the docs or interface semantics are off
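The four indexers listed above can be sketched side by side as follows. The frame, its labels, and the variable names are hypothetical and chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]},
    index=["x", "y", "z"],
)

row_by_position = df.iloc[0]       # first row, by integer position
row_by_label = df.loc["x"]         # row labelled "x"
scalar_by_position = df.iat[1, 0]  # second row, first column
scalar_by_label = df.at["y", "a"]  # row "y", column "a"

print(row_by_position.equals(row_by_label))
print(scalar_by_position, scalar_by_label)
```

`.iat` and `.at` accept only a single row/column pair and return a scalar, while `.iloc` and `.loc` also accept slices and lists.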