Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move frequencies functions to cython #17746

Merged
merged 63 commits into from
Jan 5, 2018

Conversation

jbrockmendel
Copy link
Member

@jbrockmendel jbrockmendel commented Oct 2, 2017

Move some more functions from tseries.frequencies up to tslibs.frequencies, update imports in _libs.period to cimports

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@jbrockmendel
Copy link
Member Author

The travis error is a flake8 complaint that get_freq is getting redefined in tseries.frequencies. AFAICT it is confused about scopes, because the original get_freq is a function imported at the module level and the other places get_freq is defined are both methods.

@jreback jreback added the Frequency DateOffsets label Oct 3, 2017
@codecov
Copy link

codecov bot commented Oct 10, 2017

Codecov Report

Merging #17746 into master will decrease coverage by 0.02%.
The diff coverage is 92.3%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17746      +/-   ##
==========================================
- Coverage   91.54%   91.51%   -0.03%     
==========================================
  Files         148      148              
  Lines       48804    48680     -124     
==========================================
- Hits        44676    44550     -126     
- Misses       4128     4130       +2
Flag Coverage Δ
#multiple 89.88% <92.3%> (-0.03%) ⬇️
#single 41.68% <53.84%> (+0.05%) ⬆️
Impacted Files Coverage Δ
pandas/tseries/frequencies.py 95.9% <92.3%> (+1.99%) ⬆️
pandas/plotting/_converter.py 65.22% <0%> (-1.74%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4a8496b...8129b1c. Read the comment docs.

@jbrockmendel
Copy link
Member Author

Is the flake8 warning a deal-breaker?

@bkmrkr
Copy link

bkmrkr commented Oct 17, 2017 via email

@@ -92,6 +92,7 @@ from tslibs.timezones cimport (
treat_tz_as_dateutil, treat_tz_as_pytz,
get_timezone, get_utcoffset, maybe_get_tz,
get_dst_info)
from tslibs.frequencies cimport _get_rule_month
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is prob unecessary

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like. Removing.

@@ -1,4 +1,10 @@
# -*- coding: utf-8 -*-
# cython: profile=False

cpdef object _get_rule_month(object source, object default=*)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these used anywhere but in period.pyx (for cython usage)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a duplicate of it in tslibs.parsing that should be replaced with a cimport.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(things will be easier on my end if we can clean those up in a follow-up; im guessing you'll also suggest de-privatizing _get_rule_month)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's fine, pls add to the list.

@jreback
Copy link
Contributor

jreback commented Nov 1, 2017

needs a rebase

from tslibs.frequencies cimport (
get_freq_code, get_base_alias, get_to_timestamp_base, _get_freq_str,
_get_rule_month)
from tslibs.frequencies import _MONTH_NUMBERS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't this be a cimport?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment _MONTH_NUMBERS is not cdef'd. It is imported by tseries.frequencies, but not used there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, then take it out

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comments

@@ -609,7 +537,7 @@ def is_unique(self):
def is_unique_asi8(self):
return len(self.deltas_asi8) == 1

def get_freq(self):
def get_freq(self): # noqa:F811 # Disable flake8 false-positive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reasons unknown, flake8 was producing incorrect warnings about the function get_freq imported above being redefined. Here and on line 141.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

@jbrockmendel
Copy link
Member Author

Travis error was an ImportError for moto. Rebase and push...

@@ -92,6 +92,7 @@ from tslibs.timezones cimport (
treat_tz_as_dateutil, treat_tz_as_pytz,
get_timezone, get_utcoffset, maybe_get_tz,
get_dst_info)
from tslibs.frequencies cimport _get_rule_month
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

from pandas._libs.tslib import Timedelta
from pandas._libs.tslibs.frequencies import ( # noqa
from pandas._libs.tslibs.frequencies import ( # noqa:F401
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls don't add a noqa, let's just remove the functions that are causing it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there are public functions that should remain, put them in a separate import and indicate this

@@ -149,7 +138,7 @@ def get_freq_group(cls, resostr):
"""
return get_freq_group(cls.get_freq(resostr))

@classmethod
@classmethod # noqa:F811 # Disable flake8 false-positive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls don't use flake warnings like this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See discussion from 8 days ago:

For reasons unknown, flake8 was producing incorrect warnings about the function get_freq imported above being redefined. Here and on line 141.

The easiest way to make this these unnecessary is probably #18034.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -609,7 +537,7 @@ def is_unique(self):
def is_unique_asi8(self):
return len(self.deltas_asi8) == 1

def get_freq(self):
def get_freq(self): # noqa:F811 # Disable flake8 false-positive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

@jbrockmendel
Copy link
Member Author

Since I'm going to have to run asv umpteen times to get reasonable results, I'm inclined to make a separate PR that just does the de-privatizing+centraling of _MONTHS/_MONTH_ALIASES/... and get that out of the way, cut the diff here back down.

@jbrockmendel
Copy link
Member Author

I think this should be good to go.



cpdef str get_freq_str(base, mult=1):
code = _reverse_period_code_map.get(base)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a doc-string

return _base_and_stride(freqstr)[0]


class FreqGroup(object):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move this class to the top

"""
Return frequency code group used for base of to_timestamp against
frequency code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add Parameters, Returns



cpdef get_base_alias(freqstr):
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add Parameters, Returns

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return str? (or do we just use object)

# Frequency comparison

def is_subperiod(source, target):
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cpdef bool

cpdef object get_rule_month(object source, object default='DEC'):
"""
Return starting month of given freq, default is December.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc-string


from pandas._libs.tslibs.frequencies import (get_freq_code, get_freq_str,
_INVALID_FREQ_ERROR)
from pandas.tseries.frequencies import (_offset_map,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_offset_map? future PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think _offset_map is intended to be private, don't think its imported outside of tests. (Also if we ever make the immutable thing work, we'll be able to get rid of offset_map altogether)

from pandas.tseries.frequencies import (_offset_map, get_freq_code,
_get_freq_str, _INVALID_FREQ_ERROR,

from pandas._libs.tslibs.frequencies import (get_freq_code, get_freq_str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a test_libfrequencies? for testing the cython impl

@@ -7,6 +7,10 @@
from pandas import (Index, DatetimeIndex, Timestamp, Series,
date_range, period_range)

from pandas._libs.tslibs.frequencies import (_period_code_map, get_rule_month,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe move this to test_libfrequencies as seems to be mostly doing that. (could also do pandas/tests/tseries/frequencies/test_libfrequencies.py)

@@ -7,6 +7,10 @@
from pandas import (Index, DatetimeIndex, Timestamp, Series,
date_range, period_range)

from pandas._libs.tslibs.frequencies import (_period_code_map, get_rule_month,
_period_str_to_code,
_INVALID_FREQ_ERROR)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these methods marked non-private (_period_st_to_code) used outside of libfrequencies? if so, de-privatize (and cpdef them). same with INVALID_FREQ_ERR

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly just in tests. If OK I'm going to save the remaining de-privatization and test_libfreqs for a follow-up. Too many active PRs and this one is ready for its AARP membership.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well I would rather get it right here. as we progress more and more the PR's by-definition get more fine grained and I care much more about the details. when just moving stuff, sure that doesn't matter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_INVALID_FREQ_ERROR is used once in the non-cython frequencies. The others are used in tests but otherwise not outside libfreqs

cpdef str get_base_alias(freqstr):
"""
Returns the base frequency alias, e.g., '5D' -> 'D'
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add parameters/returns.

return FreqGroup.FR_DAY
if FreqGroup.FR_HR <= base <= FreqGroup.FR_SEC:
return FreqGroup.FR_SEC
return base
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a considtion for base < FR_NS. (and make these if/else).

the final entry should be a raise (as base is out of range).

>>> get_to_timestamp_base(get_freq_code('S')[0])
9000
"""
if base < FreqGroup.FR_BUS:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

@@ -7,6 +7,10 @@
from pandas import (Index, DatetimeIndex, Timestamp, Series,
date_range, period_range)

from pandas._libs.tslibs.frequencies import (_period_code_map, get_rule_month,
_period_str_to_code,
_INVALID_FREQ_ERROR)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well I would rather get it right here. as we progress more and more the PR's by-definition get more fine grained and I care much more about the details. when just moving stuff, sure that doesn't matter.

@jbrockmendel
Copy link
Member Author

It'd be nice to close this out since momentum for tslibs refactoring has waned.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. a couple of small fixes if you can do now, can defer larger ones (e.g. add more examples / parameterize) to later, but pls fix the Example and the _assert_depr test name

Returns
-------
is_subperiod : boolean
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

later on can you come back and add examples to functions w/o


Returns
-------
is_superperiod : boolean
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

-------
rule_month: object (usually string)

Example
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> Examples

result = get_rule_month('Q')
assert (result == 'DEC')
result = get_rule_month(offsets.QuarterEnd(startingMonth=12))
print(result == 'DEC')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you remove the print

is_superperiod, is_subperiod)


def test_get_rule_month():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

later on parameterize (or now ok too)

assert (_period_str_to_code('Q-DEC') == 2000)
assert (_period_str_to_code('Q-FEB') == 2002)

def _assert_depr(freq, expected, aliases):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move this to the top and make a better name for it (and de-preivatize)

@jreback jreback added this to the 0.23.0 milestone Jan 4, 2018
is_superperiod, is_subperiod)


def assert_aliases_deprecated(freq, expected, aliases):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have these listed in deprecations for removal? @gfyoung ?

we should blow these away in 0.23.0, can you create an issue an add to the list?

@jreback jreback merged commit 77ec4e8 into pandas-dev:master Jan 5, 2018
@jreback
Copy link
Contributor

jreback commented Jan 5, 2018

thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the tslibs-freqs3 branch January 23, 2018 04:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants