PERF: groupby nth w/o dropna can use the cython routines #7569

Closed
jreback opened this issue Jun 25, 2014 · 1 comment
Labels
Error Reporting (incorrect or improved errors from pandas), Groupby, Performance (memory or execution speed performance)

Comments

jreback commented Jun 25, 2014

see #7568

  • nth is a fair bit slower than first/last, which call cython routines. When dropna is not used, nth can simply call the cython aggregation routines as well.
  • side issue: move group_last_object/group_nth_object from algos.pyx to generate_code.py (simply move group_last/group_nth from the groupby template to the same template as group_count, which also generates the object dtypes)
  • trap and reraise an error like ValueError buffer type mismatch in the cython trials. This error is generated when a built-in routine tries to use the cython routines but the function is not defined for that dtype (it SHOULD be defined for all dtypes). Currently the error is trapped and the computation goes to the python path, so the bug stays hidden.
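
A rough, pure-Python sketch of the first and third points, not the pandas implementation. Every name here (group_nth, python_nth, typed_group_nth, nth_without_dropna) is hypothetical and only stands in for the idea; the error text is modelled on Cython's buffer dtype errors. It assumes a non-negative, 0-based n and integer group labels, with negative labels meaning "no group".

```python
def group_nth(values, labels, ngroups, n):
    """Single pass over the rows: keep the n-th (0-based) value seen in
    each group. Missing values count as ordinary rows, i.e. no dropna."""
    seen = [0] * ngroups
    out = [None] * ngroups  # None marks groups with fewer than n + 1 rows
    for value, label in zip(values, labels):
        if label < 0:  # negative label: the row belongs to no group
            continue
        if seen[label] == n:
            out[label] = value
        seen[label] += 1
    return out


def python_nth(values, labels, ngroups, n):
    """Slow reference path: materialise every group, then index into it."""
    groups = [[] for _ in range(ngroups)]
    for value, label in zip(values, labels):
        if label >= 0:
            groups[label].append(value)
    return [g[n] if n < len(g) else None for g in groups]


def typed_group_nth(values, labels, ngroups, n):
    """Stand-in for a typed kernel that only accepts floats (assumption)."""
    if not all(isinstance(v, float) for v in values):
        raise ValueError("Buffer dtype mismatch, expected 'float64_t'")
    return group_nth(values, labels, ngroups, n)


def nth_without_dropna(values, labels, ngroups, n, kernel=typed_group_nth):
    """Try the fast kernel; surface a missing kernel instead of hiding it."""
    try:
        return kernel(values, labels, ngroups, n)
    except ValueError as err:
        if "Buffer dtype mismatch" in str(err):
            # A silent fallback to python_nth would hide the missing kernel.
            raise NotImplementedError(
                "fast nth kernel is not defined for this dtype"
            ) from err
        raise


values = [1.0, 2.0, 3.0, 4.0, 5.0]
labels = [0, 1, 0, 1, 0]
print(nth_without_dropna(values, labels, 2, 0))
print(python_nth(values, labels, 2, 0))
```

Both calls should return the same per-group values. Passing integer values to nth_without_dropna raises NotImplementedError in this sketch rather than quietly taking the slow path.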

jreback commented Jul 6, 2018

these are pretty ok now.

jreback closed this as completed Jul 6, 2018
jreback modified the milestones: Contributions Welcome, No action Jul 6, 2018