[BUG] from_json ArrayIndexOutOfBoundsException in 24.02 #10659
Comments
I could not reproduce the issue with the latest from branch-24.04.
The test passes only after this commit in 24.04, so it looks like the commit fixed this issue. The commit includes some schema handling that might have solved the indexing issue in
The issue still exists, at least in a few situations. The problem shows up when we ask CUDF to return data with a specific schema, but CUDF sees something that does not match the schema it expects and decides to return a struct or a list instead of an actual string. Is the remaining case that I know causes something like this.
I also manually verified that this is working now for 24.04, once you turn on JsonToStructs. I think we can close this, as there are other issues to track the remaining JSON work.
Describe the bug
Calling from_json for JSON with nested structs can cause an ArrayIndexOutOfBoundsException if the provided schema for the nested struct has fewer fields than are present in the JSON.
Steps/Code to reproduce bug
Input file test.json
Test Setup
CPU
GPU
Cause
The following code uses index as an index into both the from and to types and assumes that they are the same size.
Expected behavior
Match Spark behavior
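As a rough illustration of the Cause section above and of what matching Spark could look like, here is a minimal Java sketch. It is not the plugin's actual code: the class SchemaIndexSketch, the methods pairByPosition and selectByName, and the field names are all hypothetical. It assumes that "from" holds the field names found in the data and "to" holds the (shorter) list of field names in the requested schema. The first method shares one index across both arrays and fails when "to" is shorter; the second looks fields up by name, so extra fields in the data are ignored, which is how Spark's from_json treats fields that are not in the provided schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the spark-rapids implementation.
class SchemaIndexSketch {

    // Problematic pattern: a single index walks both arrays, which
    // silently assumes from.length == to.length.
    static List<String> pairByPosition(String[] from, String[] to) {
        List<String> pairs = new ArrayList<>();
        for (int index = 0; index < from.length; index++) {
            // Throws ArrayIndexOutOfBoundsException once index >= to.length.
            pairs.add(from[index] + "->" + to[index]);
        }
        return pairs;
    }

    // Name-based alternative: iterate over the requested schema only and
    // look each field up by name, so extra fields in the data are dropped.
    static List<String> selectByName(String[] from, String[] to) {
        Set<String> available = new HashSet<>(Arrays.asList(from));
        List<String> selected = new ArrayList<>();
        for (String name : to) {
            // A requested field that is missing from the data is marked as null.
            selected.add(available.contains(name) ? name : name + "=null");
        }
        return selected;
    }

    public static void main(String[] args) {
        String[] from = {"a", "b", "c"}; // fields present in the JSON (assumed)
        String[] to = {"a", "c"};        // fields in the requested schema (assumed)

        System.out.println(selectByName(from, to));
        try {
            pairByPosition(from, to);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("positional indexing failed: " + e.getMessage());
        }
    }
}
```

The name-based lookup is only one possible approach; the real fix may handle the schema mismatch differently, for example when the returned type is a struct or list rather than a string.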
Environment details (please complete the following information)
Additional context