API: raise TypeError on most datetime64 reduction ops #3731

Merged
merged 1 commit into from
Jun 6, 2013

cpcloud
Member

@cpcloud cpcloud commented May 31, 2013

closes #3726.

@cpcloud
Member Author

cpcloud commented May 31, 2013

will squash when travis passes

@cpcloud
Member Author

cpcloud commented May 31, 2013

@jreback i'm getting line ending problems kind of across the board. tox doesn't like it when things end with dos line endings, but gh says to use git config --global core.autocrlf input which preserves the input, but if i run dos2unix on pandas then there are 122 files that get converted, which seems like it's asking for merge conflicts...i'm a little confused about what to do

@jreback
Contributor

jreback commented Jun 1, 2013

I have crlf set to true which means ignore any endings
(I actually edit in windows) though run everything on Linux

@cpcloud
Member Author

cpcloud commented Jun 1, 2013

ok it's no big deal i will just stop messing with tox*.ini

@cpcloud
Member Author

cpcloud commented Jun 1, 2013

rebased

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

@jreback this is good 2 go if u want 2 merge

@jreback
Contributor

jreback commented Jun 5, 2013

ok.....why don't you change it to 0.11.1....
also can you do perf check to make sure nothing changes...e.g.

test_perf -b -t

and see if anything >1.15 comes up (that's related)

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

getting an error from that script...

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

i'll fix it

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

there's a star import since the parser change that wesm made that overrides the dateutil parser

@jreback
Contributor

jreback commented Jun 5, 2013

what's the error?

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

i fixed

@jreback
Contributor

jreback commented Jun 5, 2013

that shouldn't matter, it pulls from git

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

this comes after this or just look at the first seven lines of https://github.com/pydata/vbench/blob/master/vbench/git.py

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

in _parse_commit_log there's a from vbench.git import parser which imports the pandas parser, i'll submit a pr to vbench since star imports generally lead to strange bugs. i've already tracked one of these down in some cython in pandas in the past. it was a hellish soup of git bisect and lots of coffee!
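
To illustrate the kind of shadowing described above, here is a minimal sketch. The module name `fake_lib` and both `parser` values are hypothetical stand-ins, not the actual vbench or pandas code; the only point is that a star import silently rebinds a name that was bound earlier in the same namespace.

```python
import sys
import types

# Hypothetical module that happens to export a public name called `parser`.
fake_lib = types.ModuleType("fake_lib")
fake_lib.parser = "parser exported by fake_lib"
sys.modules["fake_lib"] = fake_lib

# Simulate a module body: bind `parser` first, then do a star import.
namespace = {}
exec("parser = 'parser imported explicitly'", namespace)
before = namespace["parser"]
exec("from fake_lib import *", namespace)
after = namespace["parser"]

print(before)  # the binding the author intended
print(after)   # the binding the star import left behind
```

Importing the needed names explicitly (or importing the module and using qualified access) avoids this class of bug, since nothing gets rebound without appearing in the source.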

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

maybe i will also submit a pr to pandas that allows one to use the git syntax for commit hashes. i always have to git log at least twice to memorize the short hash number...

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

i see some that are > 1.15, but i see those in other vbench runs as well with the same test name. let me investigate...

@jreback
Contributor

jreback commented Jun 5, 2013

post the bottom of the chart

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

frame_reindex_both_axes_ix                   |   0.3493 |   0.3234 |   1.0801 |
timeseries_add_irregular                     |  17.6566 |  16.2873 |   1.0841 |
frame_add_st                                 |   4.4290 |   4.0510 |   1.0933 |
series_constructor_ndarray                   |   0.0117 |   0.0106 |   1.0970 |
groupby_indices                              |   6.9420 |   6.0651 |   1.1446 |
reindex_multiindex                           |   1.2194 |   1.0020 |   1.2169 |
mask_floats                                  |   3.7626 |   2.9873 |   1.2595 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
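
As a rough sketch of the check being asked for, the rows above can be filtered against the 1.15 cutoff. This assumes the ratio column is head time divided by base time, which matches the posted numbers; the function name `slow_tests` is made up for illustration and is not part of vbench or test_perf.

```python
# Rows copied from the table above: (test name, head ms, base ms).
rows = [
    ("frame_reindex_both_axes_ix", 0.3493, 0.3234),
    ("timeseries_add_irregular", 17.6566, 16.2873),
    ("frame_add_st", 4.4290, 4.0510),
    ("series_constructor_ndarray", 0.0117, 0.0106),
    ("groupby_indices", 6.9420, 6.0651),
    ("reindex_multiindex", 1.2194, 1.0020),
    ("mask_floats", 3.7626, 2.9873),
]

THRESHOLD = 1.15  # cutoff mentioned earlier in the thread


def slow_tests(rows, threshold=THRESHOLD):
    """Return (name, ratio) pairs whose head/base ratio exceeds the threshold."""
    flagged = []
    for name, head_ms, base_ms in rows:
        ratio = head_ms / base_ms  # assumption: ratio = head / base
        if ratio > threshold:
            flagged.append((name, ratio))
    return flagged


flagged = slow_tests(rows)
for name, ratio in flagged:
    print(name, round(ratio, 4))
```

A single run above the cutoff is only a hint; rerunning the flagged benchmarks is still needed to separate a real regression from machine noise.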

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

that is my most recent commit (which I haven't pushed, just a doc update) vs. master

@jreback
Contributor

jreback commented Jun 5, 2013

you have numexpr installed right? numpy 1.7.1?

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

INSTALLED VERSIONS
------------------
Python: 2.7.5.final.0
OS: Linux 3.9.4-1-ARCH #1 SMP PREEMPT Sat May 25 16:14:55 CEST 2013 x86_64
LC_ALL: None
LANG: en_US.UTF-8

Cython: 0.20dev
Numpy: 1.8.0.dev-e9e490a
Scipy: 0.13.0.dev-e2c502f
statsmodels: 0.5.0.dev-917da60
    patsy: 0.1.0+dev
scikits.timeseries: 0.91.3
dateutil: 2.1
pytz: 2013b
PyTables: 3.0.0
    numexpr: 2.1.1.dev
matplotlib: 1.4.x
openpyxl: 1.6.2
xlrd: 0.9.3dev
xlwt: 0.7.5
sqlalchemy: 0.8.1
lxml: 3.2.1
bs4: 4.2.1
html5lib: 0.95-dev

@jreback
Contributor

jreback commented Jun 5, 2013

prob just random....try mask_floats in ipython a couple of times...

can you put up a %prun of doing it?

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

sure thing need a little time...

@jreback
Contributor

jreback commented Jun 5, 2013

take your time

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

In [18]: paste
data = np.random.randn(10000, 500)
df = pd.DataFrame(data)
df = df.where(df > 0) # create nans
bools = df > 0
## -- End pasted text --

In [19]: mask = isnull(df)

In [20]: timeit bools.astype(float).mask(mask)
10 loops, best of 3: 40.7 ms per loop

In [21]: timeit bools.astype(float).mask(mask)
10 loops, best of 3: 40.6 ms per loop

In [22]: timeit bools.astype(float).mask(mask)
10 loops, best of 3: 41.2 ms per loop

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

the prun

         485 function calls in 0.047 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.024    0.024    0.024    0.024 necompiler.py:667(evaluate)
        2    0.009    0.004    0.009    0.004 {method 'astype' of 'numpy.ndarray' objects}
        1    0.004    0.004    0.004    0.004 {method 'ravel' of 'numpy.ndarray' objects}
        1    0.004    0.004    0.004    0.004 {operator.inv}
        3    0.003    0.001    0.003    0.001 {method 'copy' of 'numpy.ndarray' objects}
        1    0.001    0.001    0.039    0.039 frame.py:5581(mask)
        1    0.000    0.000    0.047    0.047 <string>:1(<module>)
        2    0.000    0.000    0.000    0.000 {method 'reduce' of 'numpy.ufunc' objects}
        4    0.000    0.000    0.040    0.010 internals.py:1243(apply)
        7    0.000    0.000    0.000    0.000 internals.py:928(make_block)
        7    0.000    0.000    0.000    0.000 internals.py:985(__init__)
        5    0.000    0.000    0.000    0.000 internals.py:1227(_verify_integrity)
        5    0.000    0.000    0.000    0.000 internals.py:1223(shape)
        2    0.000    0.000    0.009    0.004 common.py:1614(_astype_nansafe)
        7    0.000    0.000    0.000    0.000 frame.py:387(__init__)
        7    0.000    0.000    0.000    0.000 internals.py:1354(_consolidate_check)
        2    0.000    0.000    0.009    0.004 internals.py:250(astype)
        1    0.000    0.000    0.030    0.030 internals.py:498(where)
        1    0.000    0.000    0.029    0.029 internals.py:544(func)
        7    0.000    0.000    0.000    0.000 internals.py:39(__init__)
       68    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.035    0.035 frame.py:5523(where)
        1    0.000    0.000    0.004    0.004 frame.py:889(__invert__)
        7    0.000    0.000    0.000    0.000 generic.py:516(__init__)
        2    0.000    0.000    0.000    0.000 frame.py:1747(as_matrix)
        1    0.000    0.000    0.000    0.000 expressions.py:58(_can_use_numexpr)
        3    0.000    0.000    0.003    0.001 internals.py:134(copy)
        1    0.000    0.000    0.025    0.025 expressions.py:111(_where_numexpr)
        1    0.000    0.000    0.000    0.000 necompiler.py:462(getContext)
       65    0.000    0.000    0.000    0.000 {len}
        5    0.000    0.000    0.000    0.000 generic.py:668(_consolidate_inplace)
        2    0.000    0.000    0.009    0.004 generic.py:529(astype)
        1    0.000    0.000    0.000    0.000 fromnumeric.py:2098(prod)
       18    0.000    0.000    0.000    0.000 {issubclass}
        5    0.000    0.000    0.000    0.000 frame.py:2109(__setattr__)
       23    0.000    0.000    0.000    0.000 index.py:2715(_ensure_index)
        2    0.000    0.000    0.002    0.001 internals.py:1481(copy)
        4    0.000    0.000    0.000    0.000 {numpy.core.multiarray.array}
        4    0.000    0.000    0.000    0.000 numeric.py:273(asarray)
        1    0.000    0.000    0.000    0.000 internals.py:2199(create_block_manager_from_blocks)
       10    0.000    0.000    0.000    0.000 {hasattr}
       17    0.000    0.000    0.000    0.000 internals.py:1173(_get_items)
        1    0.000    0.000    0.001    0.001 frame.py:3392(fillna)
        2    0.000    0.000    0.000    0.000 common.py:1556(is_datetime64_dtype)
        1    0.000    0.000    0.000    0.000 frame.py:531(_init_ndarray)
        2    0.000    0.000    0.000    0.000 numerictypes.py:735(issubdtype)
        2    0.000    0.000    0.000    0.000 generic.py:37(_get_axis_number)
        7    0.000    0.000    0.000    0.000 internals.py:78(set_ref_locs)
        1    0.000    0.000    0.000    0.000 _methods.py:32(_all)
        5    0.000    0.000    0.000    0.000 generic.py:593(_clear_item_cache)
        1    0.000    0.000    0.000    0.000 frame.py:564(_wrap_array)
        1    0.000    0.000    0.000    0.000 {method 'all' of 'numpy.ndarray' objects}
        6    0.000    0.000    0.000    0.000 internals.py:1634(_consolidate_inplace)
        1    0.000    0.000    0.002    0.002 frame.py:2583(reindex)
        2    0.000    0.000    0.009    0.004 internals.py:1299(astype)
       10    0.000    0.000    0.000    0.000 internals.py:1229(<genexpr>)
        1    0.000    0.000    0.002    0.002 frame.py:2690(_reindex_multi)
       15    0.000    0.000    0.000    0.000 internals.py:1225(<genexpr>)
        1    0.000    0.000    0.001    0.001 frame.py:2717(_reindex_columns)
        5    0.000    0.000    0.000    0.000 {sum}
        6    0.000    0.000    0.000    0.000 frame.py:588(_constructor)
        1    0.000    0.000    0.000    0.000 frame.py:5736(_prep_ndarray)
        2    0.000    0.000    0.000    0.000 common.py:1566(is_timedelta64_dtype)
        1    0.000    0.000    0.001    0.001 frame.py:2724(_reindex_with_indexers)
        1    0.000    0.000    0.000    0.000 frame.py:584(axes)
        1    0.000    0.000    0.001    0.001 internals.py:1293(fillna)
        1    0.000    0.000    0.030    0.030 internals.py:1275(where)
        1    0.000    0.000    0.000    0.000 {sorted}
        7    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        2    0.000    0.000    0.000    0.000 common.py:1516(is_integer)
        2    0.000    0.000    0.000    0.000 internals.py:1499(as_matrix)
        1    0.000    0.000    0.000    0.000 frame.py:592(shape)
        5    0.000    0.000    0.000    0.000 {getattr}
        3    0.000    0.000    0.000    0.000 index.py:923(reindex)
        2    0.000    0.000    0.000    0.000 numerictypes.py:667(issubclass_)
        3    0.000    0.000    0.000    0.000 necompiler.py:616(getType)
        1    0.000    0.000    0.001    0.001 frame.py:2638(reindex_axis)
       11    0.000    0.000    0.000    0.000 internals.py:1346(is_consolidated)
        7    0.000    0.000    0.000    0.000 internals.py:130(dtype)
        1    0.000    0.000    0.000    0.000 _methods.py:24(_prod)
        2    0.000    0.000    0.000    0.000 {sys._getframe}
        5    0.000    0.000    0.000    0.000 internals.py:1620(consolidate)
        4    0.000    0.000    0.000    0.000 internals.py:122(shape)
        1    0.000    0.000    0.002    0.002 generic.py:858(copy)
        1    0.000    0.000    0.025    0.025 expressions.py:164(where)
        6    0.000    0.000    0.000    0.000 frame.py:467(_init_mgr)
        1    0.000    0.000    0.000    0.000 {zip}
        1    0.000    0.000    0.000    0.000 internals.py:1359(is_mixed_type)
        1    0.000    0.000    0.000    0.000 internals.py:295(_try_coerce_result)
        3    0.000    0.000    0.000    0.000 index.py:1412(equals)
        1    0.000    0.000    0.000    0.000 internals.py:291(_try_coerce_args)
        5    0.000    0.000    0.000    0.000 {method 'clear' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'items' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'copy' of 'dict' objects}
        6    0.000    0.000    0.000    0.000 {method 'pop' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 generic.py:695(_is_mixed_type)
        1    0.000    0.000    0.001    0.001 internals.py:212(fillna)
        7    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
        4    0.000    0.000    0.000    0.000 {callable}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

@jreback
Contributor

jreback commented Jun 5, 2013

that's about right...so prob just random (this sometimes happens with ne btw.....you can get differing perf on a vbench depending on whether you are running something else on the machine, or even just randomness)

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

took less time than i thought :)

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

well chrome is a process/memory hog and i have 10 tabs open so maybe i should close them b4 running vbench otherwise just gvim and tmux running

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

actually not in the v0.11.1 will add

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

done

@jreback
Contributor

jreback commented Jun 5, 2013

looks good.....i'll merge in a bit (so you can see if you forgot anything!)

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

ok, i won't be back on github for at least another hour so i will let things mull over and check one more time

@jreback
Contributor

jreback commented Jun 5, 2013

hahah!

@jreback
Contributor

jreback commented Jun 5, 2013

did you happen to have an easy vagrant script anywhere.....i have to create another 32-bit with 3.2....as for some reason everything failing (for HDF) on that (but not 64-bit)...weird

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

i've got something half ass that u can copy paste if u want...

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

moved to a gist

@jreback
Contributor

jreback commented Jun 5, 2013

I got it set up, a bit tricky because I needed python3.2; now the trouble is I want numpy 1.6.1 but it only installs 1.7.1....weird....trying to replicate the travis box that's failing...

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

maybe u could do git checkout v1.6.1 && python setup.py install or pip install numpy==1.6.1?

@jreback
Contributor

jreback commented Jun 5, 2013

apt-get python3-numpy worked....

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

oh ok. i thought u were still having an issue

@jreback
Contributor

jreback commented Jun 5, 2013

numpy 1.6.1 vectorize is broken for strings if you don't specify the otypes.....weird (if you do then it's ok)

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

oh darn does this break replace?

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

it shouldn't...i specify the otype

@jreback
Contributor

jreback commented Jun 5, 2013

also broken for a 0 len array

oh numpy!
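
For reference, a small sketch of the workaround discussed above: passing `otypes` to `np.vectorize` so the output type is declared rather than inferred. It does not reproduce the numpy 1.6.1 string bug itself; the array contents and variable names are illustrative only, and the behaviour without `otypes` on an empty input is what recent NumPy releases are expected to do (raise ValueError), which may differ on older versions.

```python
import numpy as np

# Declaring the output type up front means vectorize never has to infer it
# by calling the function on the first element.
upper = np.vectorize(str.upper, otypes=[object])

strings = np.array(["a", "bc"], dtype=object)
result = upper(strings)

# With otypes set, a zero-length input simply yields a zero-length output.
empty = np.array([], dtype=object)
empty_result = upper(empty)

# Without otypes there is no first element to infer the type from.
try:
    np.vectorize(str.upper)(empty)
    inferred_ok = True
except ValueError:
    inferred_ok = False

print(result, empty_result.shape, inferred_ok)
```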

@jreback
Contributor

jreback commented Jun 5, 2013

if u passed the 3.2 travis machine then ok

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

i'm not sure what is more painful: strict conformance to html standard or degenerate numpy cases >:O

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

(btw hand-editing html right now because here are the options w/ lxml: whine about the most minute detail or give the incorrect answer)

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

the fast version of lxml doesn't work with valid html5! this is so f-----!

@jreback
Contributor

jreback commented Jun 5, 2013

degenerate html cases !

@cpcloud
Member Author

cpcloud commented Jun 5, 2013

if u consider the modern web degenerate!

CLN: refactor into a class decorator

DOC: add starter rls and whatsnew notes

TST: add tests

CLN: refactor bottleneck into class decorator

BUG: fix median compatibility issues

DOC: comment up some possibly obtuse looking code

CLN: fix up comments

ENH: add functools.wraps to preserve operation name in error message

DOC: move to 11.1

DOC: minor doc note
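
The commit messages above mention a decorator refactor and the use of functools.wraps to keep the operation name in the error message. The following is only a sketch of that general idea, not the code merged in this PR: `FakeArray`, `disallow`, and `total` are hypothetical names, and the decorator is written as a class, which is one possible reading of "class decorator".

```python
import functools


class FakeArray:
    """Hypothetical stand-in for an ndarray: just data plus a dtype name."""

    def __init__(self, data, dtype):
        self.data = list(data)
        self.dtype = dtype


class disallow:
    """Decorator that rejects inputs whose dtype name is in `dtype_names`."""

    def __init__(self, *dtype_names):
        self.dtype_names = set(dtype_names)

    def __call__(self, func):
        @functools.wraps(func)  # keeps func.__name__ for the error message
        def wrapper(values, *args, **kwargs):
            if getattr(values, "dtype", None) in self.dtype_names:
                raise TypeError(
                    "reduction operation {0!r} not allowed for dtype {1}".format(
                        func.__name__, values.dtype))
            return func(values, *args, **kwargs)
        return wrapper


@disallow("datetime64[ns]")
def total(values):
    """Toy reduction: sum the underlying data."""
    return sum(values.data)


print(total(FakeArray([1, 2, 3], "float64")))
try:
    total(FakeArray([1, 2, 3], "datetime64[ns]"))
except TypeError as exc:
    message = str(exc)
    print(message)
```

Because of functools.wraps, the wrapped function still reports its own name, so the error message names the operation the caller actually invoked rather than the generic wrapper.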
@cpcloud
Member Author

cpcloud commented Jun 6, 2013

this is ready 2 go

jreback added a commit that referenced this pull request Jun 6, 2013
API: raise TypeError on most datetime64 reduction ops
@jreback jreback merged commit 6dbcc83 into pandas-dev:master Jun 6, 2013
@jreback
Contributor

jreback commented Jun 6, 2013

thank you sir

@cpcloud
Member Author

cpcloud commented Jun 6, 2013

happy to help. now to CRUSH the html bug...

@cpcloud cpcloud deleted the raise-on-datetime-ufuncs-3726 branch June 8, 2013 19:18
Successfully merging this pull request may close these issues.

datetime Series should raise on most numerical ops
2 participants