Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-49939][SQL] Codegen Support for json_object_keys (by Invoke & RuntimeReplaceable) #48428

Closed
wants to merge 3 commits into from

Conversation

panbingkun
Copy link
Contributor

What changes were proposed in this pull request?

The pr aims to add Codegen Support for json_object_keys.

Why are the changes needed?

  • improve codegen coverage.
  • simplified code.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass GA & Existed UT (eg: JsonFunctionsSuite#json_object_keys function)

Was this patch authored or co-authored using generative AI tooling?

No.

@panbingkun panbingkun marked this pull request as ready for review October 12, 2024 16:33
jsonParser.skipChildren();
}
return new GenericArrayData(arrayBufferOfKeys.toArray());
} catch (IOException e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you handle JsonProcessingException?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If written as one of the following two styles:

  • A.
image
  • B.
image
  • Because:
    JsonProcessingException extends JacksonException and JacksonException extends java.io.IOException

  • So, Based on my knowledge that catch java.io.IOException should already be enough to achieve the goal.

@MaxGekk
Copy link
Member

MaxGekk commented Oct 13, 2024

+1, LGTM. Merging to master.
Thank you, @panbingkun.

@MaxGekk MaxGekk closed this in 54fd408 Oct 13, 2024
@SubhamSinghal
Copy link

@panbingkun Do you plan to add codeGen support for StructsToJson and JsonToStructs expression as well?

@panbingkun
Copy link
Contributor Author

JsonToStructs

Yes!

@SubhamSinghal
Copy link

cool. If you are not planning for StructsToJson, then I will take your code as ref and try to implement it.

@panbingkun
Copy link
Contributor Author

cool. If you are not planning for StructsToJson, then I will take your code as ref and try to implement it.

I don't think StructsToJson(to_json) is suitable for using Invoke & RuntimeReplaceable to support codegen.
A draft implementation can be found at: #48467

@SubhamSinghal
Copy link

SubhamSinghal commented Oct 15, 2024

sorry for naive question, any reason why to_json is not suitable for using Invoke & RuntimeReplaceable? I was checking proto equivalent function toProtobuf expression which extends RuntimeReplaceable

@panbingkun
Copy link
Contributor Author

sorry for naive question, any reason why to_json is not suitable for using Invoke & RuntimeReplaceable? I was checking proto equivalent function toProtobuf expression which extends RuntimeReplaceable

If we use Invoke and RuntimeReplaceable to support Codegen for to_json,

  • Do we need to create a JacksonGenerator every time? Based on my understanding, if we do this, the performance doesn't seem to be optimal.
  • If we only create JacksonGenerator once in to_json, then we have to pass it in the Invoke call, and we must make JacksonGenerator extend Serializable, which will cause more or less changes to the old logic. Variables that may be used in Invoke, such as writer and converter, all have similar issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants