Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Use numexpr for all arithmetic operators + refactor arithmetic operations #4051

Conversation

jtratner
Copy link
Contributor

Resolves #3765.

This is going to be split into two parts and then rebased on top of #3482.

Tasks:

  • Move arithmetic operations to separate file (e.g., ops.py).
  • Explore combining series/panel/frame operation methods.
  • Add rand, rxor, ror, etc...

This adds:

  1. numexpr acceleration to Series, Panel, Panel4D, and DataFrame (SparsePanel has to opt-out b/c doesn't respond to shape)
  2. Adds a test mechanism to numexpr to check that it actually is accelerated.
  3. Moves all arithmetic operators to core/generic and binds them at import to reduce code duplication.
  4. Makes all of Series, Panel, Panel4D, DataFrame, SparsePanel support the entire range of arithmetic operators [previously some were missing].
  5. Cleans up core/expressions slightly
  6. Test cases for all operations.
  7. Improves a few test utilities to be better (error message for isnull and better error for isinstance checks in util/testing)

@cpcloud - I think this would be very easy to add into your current changes with eval. (mostly just changes if hasattr --> getattr to be cleaner).

@jtratner
Copy link
Contributor Author

I can combine a few of the commits if necessary ... just wanted to make the progression clearer.

@cpcloud
Copy link
Member

cpcloud commented Jun 27, 2013

This is great. Numexpr ftw.

@jtratner
Copy link
Contributor Author

yay ... another network error :-/

@cpcloud
Copy link
Member

cpcloud commented Jun 27, 2013

fixes r on the way

@cpcloud
Copy link
Member

cpcloud commented Jun 27, 2013

moving at the speed of travis :|

@jtratner
Copy link
Contributor Author

well, I realized I missed testing the entire set of arithmetic methods in the individual cases (outside of test_expressions), so I guess it's okay. Btw - is it worth it to mark the panel tests as slow in expressions if they add 1.5s to the total suite?

@jtratner
Copy link
Contributor Author

(1.5s total)

@cpcloud
Copy link
Member

cpcloud commented Jun 27, 2013

Is it a single test? If so yes else no

@jtratner
Copy link
Contributor Author

no, it's not.

Now evaluate takes eval_kwargs (generally just truediv) to
numexpr.evaluate plus adds a set of testing methods to check
that numexpr was actually used successfully.

Also changes the if hasattr idiom --> getattr.
Adds a better error for ``isinstance`` checks
as well as a better error message for the ``isnull`` case.
Makes the entire arithmetic test suite explicit as well as sets up test
cases to make sure the default_axis responses do not change.
BUG: Fix ``_fill_zeros`` call to work even if TypeError
(previously was inconsistent).
SparsePanel has to opt-out because it doesn't respond to ``shape``.

ENH: Add flex comparison methods to Series and Panel
@jreback
Copy link
Contributor

jreback commented Jun 27, 2013

@jtratner this real nice..

not that I think ops actually make much sense on a Panel4D, but maybe throw a couple of tests, e.g. maybe adding them or something...?

@jtratner
Copy link
Contributor Author

@jreback okay, will do.

@@ -326,18 +329,35 @@ def _flex_method(op, name):
-------
result : Series
""" % name
# copied directly from _arith_method above...we'll see whether this works
def na_op(x, y):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

completely random

do you think its possible to consolidate these arithmetic ops out of series/frame/panel and move to say ops.py
(they still prob need to be separate but maybe can somewhat be combined)....

I am refereing to the actual methods, e.g. _flex_method and the sub-functions na_op

just a thought

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was definitely thinking of this. It's possible that much of series and
frame could be combined. I'll explore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gr8......I would STILL be inclined to keep them separate as it reduces code complexity but there may be some easily fixed stuff (though +1 to move to say ops.py)

@jreback
Copy link
Contributor

jreback commented Jul 20, 2013

@jtratner

@cpcloud and I created a new master branch to handle the new eval stuff (which is actually ready to merge, just after 0.13), branch is eval-3393 (and links to the eval PR); feel free to add any of this type of stuff to itt

@jtratner
Copy link
Contributor Author

oh awesome. I'll definitely do that.

On Fri, Jul 19, 2013 at 9:08 PM, jreback notifications@github.com wrote:

@jtratner https://github.com/jtratner

@cpcloud https://github.com/cpcloud and I created a new master branch
to handle the new eval stuff (which is actually ready to merge, just after
0.13), branch is eval-3393 (and links to the eval PR); feel free to add any
of this type of stuff to itt


Reply to this email directly or view it on GitHubhttps://github.com//pull/4051#issuecomment-21285701
.

@jreback
Copy link
Contributor

jreback commented Jul 20, 2013

you can push, just DONT rebase
and make sure you pull before you push, if you can't push let me know

@jtratner
Copy link
Contributor Author

@jreback good to know.

@jreback
Copy link
Contributor

jreback commented Jul 23, 2013

@jtratner after I update #3482 / #4324

I think it would be appropriate to use part of this PR, namely which moved the arithmetic/comp ops to generic.py to cover all of NDFrame (as Series will then inherit too), e.g. your _add_flex.... type methods...(which I guess will live in ops.py?
(this is essentially the 3rd todo in #4324)

the numexpr stuff can be done separately (after)

@jtratner
Copy link
Contributor Author

Yeah, I agree with you on that. Adding numexpr speedup is trivial compared
to normalizing ops between all the parts.

To be clear: do you want me to make a PR and/or push some of these changes
to eval-3393 branch that you have already or do you want me to wait until
after its merged?
On Jul 23, 2013 6:47 PM, "jreback" notifications@github.com wrote:

@jtratner https://github.com/jtratner after I update #3482#3482
#4324 #4324

I think it would be appropriate to use part of this PR, namely which moved
the arithmetic/comp ops to generic.py to cover all of NDFrame (as Series
will then inherit too), e.g. your _add_flex.... type methods...(which I
guess will live in ops.py?
(this is essentially the 3rd todo in #4324#4324
)

the numexpr stuff can be done separately (after)


Reply to this email directly or view it on GitHubhttps://github.com//pull/4051#issuecomment-21451945
.

@jreback
Copy link
Contributor

jreback commented Jul 24, 2013

I think that eval-3393 is pretty independent of the series refactor. what you can do if you want is make a new PR on top of series #3482 (or do after I have merged). The numexpr stuff let's do after both of these PR's are in in your artihmetic/comparison changes.

@jtratner
Copy link
Contributor Author

Makes sense

@cpcloud
Copy link
Member

cpcloud commented Jul 24, 2013

@jreback should probably merge series first then eval...i'll rebase after that merge and then test again to make sure things are working correctly....but shouldn't really matter that much

@jtratner
Copy link
Contributor Author

Now that I think about it, separating out the numexpr part will make this way easier to apply on top of that PR.

@jtratner
Copy link
Contributor Author

Closing as per @jreback 's suggestion and will split into two separate PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PERF: have series/panel arithmetic operators use expressions (numexpr)
3 participants