Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recursion limit reached when joining period and datetimeindexes #3899

Closed
cpcloud opened this issue Jun 14, 2013 · 17 comments · Fixed by #5101
Closed

recursion limit reached when joining period and datetimeindexes #3899

cpcloud opened this issue Jun 14, 2013 · 17 comments · Fixed by #5101
Assignees
Labels
Bug Datetime Datetime data dtype Error Reporting Incorrect or improved errors from pandas Indexing Related to indexing on series/frames, not to indexes themselves
Milestone

Comments

@cpcloud
Copy link
Member

cpcloud commented Jun 14, 2013

one should probably not be trying to join these types of indices. even so, a stack overflow is unacceptable. there should at least be an error message...

code:

In [9]: df = mkdf(10, 10, data_gen_f=lambda *args: randint(2), c_idx_type='p', r_idx_type='dt')

In [10]: s = df.iloc[:5, 0]

In [11]: s.index.join(df.columns, how='outer')
@cpcloud
Copy link
Member Author

cpcloud commented Jun 14, 2013

cc #3900

@cpcloud
Copy link
Member Author

cpcloud commented Jun 14, 2013

@jreback should joins be allowed here (resulting in an object index)? alternative is to raise a type error disallowing joins between period and datetime indices. more generally, to revisit the issue about datetimeindex joins, should anything besides "date-able" strings and integers be allowed to join, the result being an object index if they were, otherwise raise a typeerror...i think this particular bug might be easy to fix if we choose to raise, but legacy support will most likely be broken since the latter would be more strict with allowed joins...

@jreback
Copy link
Contributor

jreback commented Jun 14, 2013

u can convert period to datetime (and vice versa)
so I think u should do that

though still fuzzy when/if this is useful

@cpcloud
Copy link
Member Author

cpcloud commented Jun 14, 2013

i don't think it is that useful and maybe this bug is even a bit academic, lmk if u think this not worth fixing. these kinds of bugs are notoriously difficult because the code path to the cycle has the potential to be enormous. thankfully, i just discovered the cycle, and it's not that long. i believe the problem is that object datetime indices are not being converted to datetime indices and thus self and other dtype keep testing as not equal ad infinitum.

another way to see it is that periodindexes should be kept as object here and thus the dtypes will test equal on the next recursive call and the code will terminate.

either way if the dtypes are not equal the code attempts conversion to object but the __new__ method intercepts and periodindexes always get constructed as periodindexes then join is called recursively and never finishes

@jreback
Copy link
Contributor

jreback commented Jun 14, 2013

that's why it might make sense to only allow certain types of joins (and conversions)
raising otherwise

@cpcloud
Copy link
Member Author

cpcloud commented Jun 14, 2013

okay...also this bug hidden by the "fake" unicode error that i submitted a pr for last night. later today i'll put up a table with what i think makes sense for conversions and would be nice if u could suggest additions/deletions...then will fix it according to that. thanks

@cpcloud
Copy link
Member Author

cpcloud commented Jun 14, 2013

i think implementing this _assert_can_do_setop method in the index base class is probably a good idea. and then have a _join method (overloaded by subclasses) that is called by join

@cpcloud
Copy link
Member Author

cpcloud commented Jun 14, 2013

@jreback so one solution that i have working is to convert period/datetime indices to the other depending on which is doing the join. the obvs prb here is all period can convert to datetime where as not all datetime can cvt to period. possible solutions:

  1. let periodindex raise when it cannot convert the rhs datetimeindex to period

or

  1. convert everything to datetime indexes

@cpcloud
Copy link
Member Author

cpcloud commented Jun 15, 2013

i think 1) is the best option since it seems weird to have the periodindex join method return a datetimeindex even if the other is a datetimeindex. i think it should raise if the datetimeindex can't be converted to a periodindex and of course convert if it can

@jreback
Copy link
Contributor

jreback commented Jun 15, 2013

let push to 0.12

@jreback
Copy link
Contributor

jreback commented Sep 21, 2013

what are we doing with this? punt/push?

@cpcloud
Copy link
Member Author

cpcloud commented Sep 21, 2013

i'll fix by 0.13

@cpcloud
Copy link
Member Author

cpcloud commented Sep 21, 2013

just need to clarify what the exact issue is

@cpcloud
Copy link
Member Author

cpcloud commented Sep 21, 2013

as in write it down here so that we can discuss it

@jreback
Copy link
Contributor

jreback commented Sep 30, 2013

why don't we just raise for now? and then let's figure out in 0.14 (if it should even be allowed)

@cpcloud
Copy link
Member Author

cpcloud commented Oct 3, 2013

pr in the works

@cpcloud
Copy link
Member Author

cpcloud commented Oct 3, 2013

entire problem was that PeriodIndex.astype(object) was returning an ndarray which would then get converted back to period iindex then reconverted back to raw ndarray and so on until the recursion limit was reeached

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Datetime Datetime data dtype Error Reporting Incorrect or improved errors from pandas Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants