ENH: Ability to tz localize when index is implicitly in tz #4706
@@ -360,6 +360,22 @@ def test_with_tz_ambiguous_times(self):
        dr = date_range(datetime(2011, 3, 13), periods=48,
                        freq=datetools.Minute(30), tz=pytz.utc)

    def test_with_tz_ambiguous_times_imply_dst(self):
It would be nice to change the name of this test to test_imply_dst and add 1-2 test cases for when imply_dst is set to True but it's not actually ambiguous. [Changing the name of the test just makes it easier to find the specific test case and understand what it's doing].
@nehalecky - any comments on this? |
To me, it looks like pytz calls the marker for this
If you have some logic that somehow decides whether or not to infer dst (which I think is what you're doing), you could call it If |
On the exception, I was following the code that was already there; however, it appears from the pytz source that it just subclasses Exception, so I will enhance the message. "infer" is the right verb here. I will make those changes. I was planning on adding the docs, both the API and tz handling sections. Also, it seems like the API docs are lacking mention of DatetimeIndex. Is that intentional or just waiting for somebody to add it? On your comment above about adding a comment on the line, was there something in particular you want clarified? If I had to guess it would be the trans_idx line, but I want to confirm. |
Thanks - there are a bunch of Exception messages that could do with a rewording - it's much more helpful to have a useful message.
I'm pretty sure it's not intentional. If you want to contribute docs for that - it would be great (though maybe you want to put that in a separate PR, just for clarity).
I just was following everything up to that point - it combines so many things that it's unclear to me. Your call if you want to add a comment explaining what it's doing or not [definitely would want a different set of eyes than mine to look at it then]. I mean this line: |
heh, I got the line wrong too - meant this one |
Also - can you hook up Travis to this? https://github.com/pydata/pandas/blob/master/CONTRIBUTING.md#steps-to-enable-travis-ci |
@jtratner - incorporated your comments. I wasn't sure how to add another pull request for the DatetimeIndex documentation (having an additional branch seemed excessive). Also, I couldn't think of more tests where the fall transition would be obvious without already having a tz-aware index, in which case it could never be localized. I'm open to suggestions for that. As far as I know I added Travis, but perhaps I'm mistaken. |
@rockg you got Travis working :) and it definitely makes sense to add the docs with this PR. I'll take a look when I get a chance. |
@rockg nice start to this! I had a the
|
You are correct that this only addresses the Ambiguous error that occurs in the fall transition. I understand your example (it's second-ending time rather than second-beginning, e.g., the time from 1:59:59 to 2:00:00 can really be represented as either 1:59:59 or 2:00:00 as it is really representing the span between). This is always a choice when determining how to represent times and it would be nice if pytz were more flexible. I don't have any easy thoughts here besides changing your times to be second-beginning, which I'm sure you thought of. Frequencies less than an hour are supported. The code just looks for a negative diff in times (i.e., a time repeated in the hour) and assumes that these are for the non-DST hour. If there is no such diff, then you still have ambiguous times. I think this logic will work for the vast majority of the cases. Can you point your reference to the number of repeated hours? |
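The negative-diff rule described above can be sketched in a few lines. This is only an illustration using plain `datetime` objects, not the code from this PR: the helper name `infer_dst_flags` is hypothetical, and it assumes the naive timestamps are in recorded order, start while DST is in effect, and cross at most one fall transition.

```python
from datetime import datetime, timedelta


def infer_dst_flags(wall_times):
    """Hypothetical helper: one boolean per timestamp, True while DST is assumed.

    Assumption: the series starts in DST; the first backwards step in the
    wall-clock times marks the switch to standard (non-DST) time.
    """
    flags = []
    in_dst = True
    previous = None
    for current in wall_times:
        # A negative difference means the clock was set back (a repeated hour).
        if previous is not None and current - previous < timedelta(0):
            in_dst = False
        flags.append(in_dst)
        previous = current
    return flags


# Half-hourly wall-clock readings across a fall transition (01:00 and 01:30 repeat).
times = [
    datetime(2011, 11, 6, 0, 30),
    datetime(2011, 11, 6, 1, 0),
    datetime(2011, 11, 6, 1, 30),
    datetime(2011, 11, 6, 1, 0),
    datetime(2011, 11, 6, 1, 30),
    datetime(2011, 11, 6, 2, 0),
]
flags = infer_dst_flags(times)
```

If no backwards step is found, every flag stays True and any timestamps inside the repeated hour remain ambiguous, which is the limitation mentioned above.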
@rockg how's this coming ? |
I believe it's all set. I incorporated @jtratner's comments and have been using it without issue. |
@nehalecky what do you think? you seem to have the best sense of this. |
@rockg and pls squash down to a smaller number of commits |
Sure (and I'll also point to the new API documentation). Is there any help on how to combine the commits? Naively I could back out the current changes and recommit, but that seems unnecessary and unwieldy. |
you can combine as you see fit....ideally just put the changes that go together, together. you can also put docs/tests/changes in separate. whatever seems logical. just rebase
|
@rockg to be clear, this means that you'll see something like this:
change "pick" to "squash" and it'll be all good. |
Hey all. @rockg, thanks for your replies and again, nice job! While there is more that can be done with DST transition inference, I think that this is a great start and a very helpful feature! I look forward to its incorporation in pandas and hope to contribute to expanding its functionality in the future. :) |
after you rebase, if @jreback and @nehalecky think it looks good, I think we can merge this. |
Do I then "push" the rebased commits? |
do git push --force |
Thanks @cpcloud I think this is all in good order. There are two commits, one for the new DatetimeIndex api documentation and then another for the tz_localize change. |
.. autosummary::
   :toctree: generated/

   DatetimeIndex.delete
can you take out the generic Index functions here and just leave the ones that are specific to DatetimeIndex? thanks.
The logical follow-on is that we should document Index as well.
@rockg just run the current vbench to see if there is any degradation given the changes touch some general ts code (I don't think u will see anything though) |
i meant a new vbench that specifically uses |
@rockg ping us when ready with the vbench..thxs |
@jreback Think it's set. The documentation commit is unchanged. |
@rockg can you rebase on master just to make sure.....we just merged in nanosecond support so making sure no issues also can you post the vbench results? |
@jreback Seems like rebasing is potentially destructive so want to make sure I'm doing it right. Would it be:
And for vbench, do you want the entire test set pasted here? I ran some yesterday and noticed a couple ratios of 1.6 but it's not clear if that is a result of something unrelated to my changes or not. |
you may have merge conflicts which you need to resolve you can just post anything > 1.5 or so (or if you think its somewhat related to your changes) |
@jreback Need some help here:
Which sounds like an issue in that I have two commits vs one. Nothing jumps out on the benchmark.
|
# make sure u have a remote ref to upstream
git remote add upstream git://github.com/pydata/pandas.git
git fetch upstream
git rebase upstream/master # put your changes on top of current master
# after that finishes
git rebase -i upstream/master # manually squash/fixup/reword etc.
# force push to your remote branch (the PR)
git push --force |
@rockg those benchmarks look fine....as soon as you rebase we can merge |
you'll need to resolve the merge conflicts in |
Start documentation for DatetimeIndex and Index.
@cpcloud Thanks, I'm almost there. I believe I resolved the conflicts, but the GitHub Mac tool shows a bunch of lines as deleted even though they are still in the file. Proceed? |
Fix for issue pandas-dev#4230, which allows localizing an index that is implicitly in a tz (e.g., when reading from a file) by passing infer_dst to tz_localize.
ENH: Ability to tz localize when index is implicitly in tz
Fix to issue #4230 which allows to localize an index which is
implicitly in a tz (e.g., reading from a file) by passing imply_dst to
tz_localize.
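
For readers who want to try the behaviour discussed in this thread, here is a minimal sketch. It assumes a recent pandas release, where the option introduced here (first called `imply_dst`, then renamed `infer_dst` during review) is spelled `ambiguous="infer"`; the sample timestamps and the `America/New_York` zone are illustrative choices, not taken from the PR's tests.

```python
import pandas as pd

# Naive wall-clock times read "as is" (e.g. from a file) across a fall transition:
# 01:00 and 01:30 each appear twice because the clock was set back.
naive = pd.DatetimeIndex([
    "2011-11-06 00:30",
    "2011-11-06 01:00",
    "2011-11-06 01:30",
    "2011-11-06 01:00",
    "2011-11-06 01:30",
    "2011-11-06 02:00",
])

# Without a hint, localizing the repeated hour raises an ambiguous-time error.
# ambiguous="infer" asks pandas to use the order of the timestamps to decide
# which occurrences are DST and which are standard time.
localized = naive.tz_localize("America/New_York", ambiguous="infer")

# UTC offset in hours for each localized timestamp.
offsets = [ts.utcoffset().total_seconds() / 3600 for ts in localized]
```

The first pass through the repeated hour is treated as DST and the second pass as standard time, so the offsets change partway through the index.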