BUG: to_json failing on PyPy #40525
Looks harmless cc @WillAyd
looks fine and backport ok.
@@ -144,6 +144,7 @@ typedef struct __PyObjectEncoder {
enum PANDAS_FORMAT { SPLIT, RECORDS, INDEX, COLUMNS, VALUES };

int PdBlock_iterNext(JSOBJ, JSONTypeContext *);
static Py_ssize_t get_attr_length(PyObject *obj, char *attr);
Rather than forward declaring this, can you just swap the positions of the get_attr_length and is_simple_frame functions?
Will make that change
return 0;
}
int ret = (get_attr_length(mgr, "blocks") <= 1);
Hmm I'm not really sure about this change. The problem with how this code is currently set up is that when is_simple_frame is True then we will just serialize whatever .values is for that DataFrame. However, the rules for extension arrays are a bit different.
Do we have test coverage for DataFrames that only contain 1 block of an extension array? If not I'd be worried this could change things that we aren't testing currently
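For readers following the thread, here is a rough pure-Python model of the check under discussion. It is only a sketch, not the C implementation: the bodies of get_attr_length and is_simple_frame are inferred from the diff lines quoted above, the early "return 0" is assumed to be the branch taken when _mgr cannot be fetched, and the SimpleNamespace objects are hypothetical stand-ins for a DataFrame and its block manager.

```python
from types import SimpleNamespace


def get_attr_length(obj, attr):
    """Return len(obj.<attr>), or 0 if the attribute is missing or unsized."""
    try:
        return len(getattr(obj, attr))
    except (AttributeError, TypeError):
        return 0


def is_simple_frame(frame):
    """Model of the diff's check: 'simple' means at most one block."""
    try:
        mgr = frame._mgr
    except AttributeError:
        # Assumed counterpart of the early `return 0;` in the hunk.
        return False
    return get_attr_length(mgr, "blocks") <= 1


# Hypothetical stand-ins; real frames hold Block objects, not strings.
single = SimpleNamespace(_mgr=SimpleNamespace(blocks=("int64 block",)))
mixed = SimpleNamespace(
    _mgr=SimpleNamespace(blocks=("int64 block", "object block"))
)
no_mgr = SimpleNamespace()

print(is_simple_frame(single), is_simple_frame(mixed), is_simple_frame(no_mgr))
```

In this model a frame with a single block takes the "simple" path no matter what kind of block it is, which is why a lone extension-array block is the case worth checking.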
Didn't see any coverage, have added some simple tests for the one ExtensionBlock case (which match 1.1.5 behavior). Let me know if this is what you had in mind!
Thanks for the updates. Actually, reading through the history a little more, I understand the length check that you've added - I didn't realize that was the case before in internals. The test isn't really what I was looking for, but let's just go ahead and remove that to get this in. For context, the extension arrays I was worried about are likely commented here:
We've had a rather strange history with But yea for now let's just remove the test and merge this in
Removed the test. Thanks for pointing that out, I see what you mean now
lgtm when green
thanks @mzeitlin11
@meeseeksdev backport 1.2.x
Co-authored-by: Matthew Zeitlin <37011898+mzeitlin11@users.noreply.github.com>
Not sure how best to test this - confirmed this now works when run on PyPy, but since we don't test against PyPy the examples in the issue (or any other simple case) which fail on PyPy will still pass on master.
Benchmarks look similar, might be minor performance improvement for the single block case: