Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use fused types for reshape #22454

Merged
merged 3 commits into from
Sep 18, 2018
Merged

Conversation

jbrockmendel
Copy link
Member

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@topper-123
Copy link
Contributor

I know very little of Cython (or fused), but looked up in the docs, and saw this warning

Note

Support is still somewhat experimental, there may be bugs! 

Are we sure this is mature enough?

@jbrockmendel
Copy link
Member Author

Are we sure this is mature enough?

This is a reasonable concern. The docs don't say what version that warning was added in, but later on that page it suggests that the feature was available in 0.20.0 (... I think) in which case its been around for awhile.

In porting the tempita functions one-by-one I am finding a few places where there might be bugs, but I'm not at all ready to rule out the possibility that the problems are on my end.

Even if we don't go in this direction, hopefully we can produce some good bug reports for cython. In my experience they've been very responsive/helpful.

@topper-123
Copy link
Contributor

topper-123 commented Aug 22, 2018

Yeah, but my concern is if bugs in pandas' behaviour originates from Cython, then we might get situations where:

  • pandas 0.24 requires cython 0.28.2+
  • pandas 0.24.1 requires cython 0.28.6+ (because of a bug in cython 0.28.5)
  • pandas 0.24.2 requires cython 0.29 (a bug in Cython 0.28.6...)
  • etc.

which would be annoying and maybe a bit unstable. This could also happen from other bugs in cython, but the risk is probably greater for the fused API because it is declared not bug-free by the cython team...

I don't have the ability to make a recommendation on this, but take it as a thing to consider and check for. Maybe consult with the Cython team whether fused types can be considered stable now?

@codecov
Copy link

codecov bot commented Aug 23, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@55d176d). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master   #22454   +/-   ##
=========================================
  Coverage          ?   92.04%           
=========================================
  Files             ?      169           
  Lines             ?    50773           
  Branches          ?        0           
=========================================
  Hits              ?    46734           
  Misses            ?     4039           
  Partials          ?        0
Flag Coverage Δ
#multiple 90.45% <ø> (?)
#single 42.23% <ø> (?)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 55d176d...eea4a53. Read the comment docs.

@gfyoung
Copy link
Member

gfyoung commented Aug 25, 2018

Maybe consult with the Cython team whether fused types can be considered stable now?

@scoder :

Hi, I noticed that you are one of the main commiters to the Cython repository, and we had a question regarding fused types. We here at pandas were planning on using them in place of tempita for some of our source code as you can see in this PR. However, there's a note in the docs from >5 years ago (at least according to the commit history) regarding the instability of fused types. Is this still the case?

Happy to formally open an issue if that is preferred (this is more of a usage question, hence the "cold" mention for lack of a better term). Thanks!

@gfyoung gfyoung added Refactor Internal refactoring of code Internals Related to non-user accessible pandas implementation labels Aug 25, 2018
@scoder
Copy link

scoder commented Aug 25, 2018

Thanks for asking, I removed that note from the docs.

@gfyoung
Copy link
Member

gfyoung commented Aug 25, 2018

@scoder : Thanks for the timely reply! Greatly appreciated.

@topper-123 : Does that response alleviate your concerns?

@topper-123
Copy link
Contributor

👍

Copy link
Member

@gfyoung gfyoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

cc @jreback

cdef:
Py_ssize_t i, j, w, nulls, s, offset

if reshape_t is not object:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we make the inner a function instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No dice; we'd need a "nogil" version and a not-nogil version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think like long-term cython may implement a feature discussed in #22452 that may let us clear this up further. For now this is much more readable than the existing tempita implementation.

@jbrockmendel
Copy link
Member Author

@jreback gentle ping

@jreback jreback added this to the 0.24.0 milestone Sep 18, 2018
@jreback
Copy link
Contributor

jreback commented Sep 18, 2018

ok this is fine (rebase the other fused impls after this merge).

@jreback jreback merged commit 6151ba6 into pandas-dev:master Sep 18, 2018
@jbrockmendel jbrockmendel deleted the temp_reshape branch September 18, 2018 13:51
aeltanawy pushed a commit to aeltanawy/pandas that referenced this pull request Sep 20, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Internals Related to non-user accessible pandas implementation Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants