Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raw json support #200

Merged
merged 6 commits into from
Mar 17, 2022
Merged

Raw json support #200

merged 6 commits into from
Mar 17, 2022

Conversation

swallez
Copy link
Member

@swallez swallez commented Mar 14, 2022

Fixes #143

Adds withJson methods to object builders. These methods populate an object builder from arbitrary JSON by using its deserializer. This allows creating API objects with a combination of programmatic call to property setters and arbitrary JSON data.

The main use cases are loading search requests, index mappings or documents from arbitrary JSON sources.

@swallez
Copy link
Member Author

swallez commented Mar 15, 2022

@spinscale @dadoonet can you have a look at the docs and the WithJson API for this feature and tell me what you think?

@szabosteve can you do a quick review of the docs (rendered)?

@dadoonet
Copy link
Member

I'm wondering why do we need to provide the jsonpMapper:

CreateIndexRequest req = CreateIndexRequest.of(b -> b
    .index("some-index")
    .withJson(
        jsonStream, 
        client._jsonpMapper() 
    )
);

Would it be possible to have a default one like this:

CreateIndexRequest req = CreateIndexRequest.of(b -> b
    .index("some-index")
    .withJson(jsonStream)
);

@dadoonet
Copy link
Member

About the documentation, I think that each example should be self contained. So for the last section, add again the queryJson:

Reader queryJson = new StringReader(
    "{" +
    "  \"query\": {" +
    "    \"range\": {" +
    "      \"@timestamp\": {" +
    "        \"gt\": \"now-1w\"" +
    "      }" +
    "    }" +
    "  }" +
    "}");

Copy link
Member

@dadoonet dadoonet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments.



public void query1() throws IOException {
//tag::query
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd write:

        {
        //tag::query
        Reader queryJson = new StringReader(
            "{" +
            "  \"query\": {" +
            "    \"range\": {" +
            "      \"@timestamp\": {" +
            "        \"gt\": \"now-1w\"" +
            "      }" +
            "    }" +
            "  }" +
            "}");

        SearchRequest agg1 = SearchRequest.of(b -> b
            .withJson(queryJson, client._jsonpMapper()) //<1>
            .aggregations("max-cpu", a1 -> a1 //<2>
                .dateHistogram(h -> h
                    .field("@timestamp")
                    .calendarInterval(CalendarInterval.Hour)
                )
                .aggregations("max", a2 -> a2
                    .max(m -> m.field("host.cpu.usage"))
                )
            )
            .size(0)
        );

        Map<String, Aggregate> aggs1 = client
            .search(agg1, Void.class) //<3>
            .aggregations();
        //end::query
        }

        {
        //tag::query-and-agg
        Reader queryJson = new StringReader(
            "{" +
            "  \"query\": {" +
            "    \"range\": {" +
            "      \"@timestamp\": {" +
            "        \"gt\": \"now-1w\"" +
            "      }" +
            "    }" +
            "  }" +
            "}");

        Reader aggJson = new StringReader(
            "\"size\": 0, " +
            "\"aggs\": {" +
            "  \"hours\": {" +
            "    \"date_histogram\": {" +
            "      \"field\": \"@timestamp\"," +
            "      \"interval\": \"hour\"" +
            "    }," +
            "    \"aggs\": {" +
            "      \"max-cpu\": {" +
            "        \"max\": {" +
            "          \"field\": \"host.cpu.usage\"" +
            "        }" +
            "      }" +
            "    }" +
            "  }" +
            "}");

        SearchRequest agg2 = SearchRequest.of(b -> b
            .withJson(queryJson, client._jsonpMapper()) //<1>
            .withJson(aggJson, client._jsonpMapper()) //<2>
            .ignoreUnavailable(true) //<3>
        );

        Map<String, Aggregate> aggs2 = client
            .search(agg2, Void.class)
            .aggregations();
        //end::query-and-agg
        }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 78e8ee2

protected abstract B self();

@Override
public B withJson(JsonParser parser, JsonpMapper mapper) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to have a default json mapper that we can use as well?
So the method would be:

public B withJson(JsonParser parser)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 78e8ee2

@spinscale
Copy link
Contributor

So I pretty much only looked at LodingJsonTest.java. Few minor comments.

First, can one input overwrite another? The samples only show things along side. What happens in query2() if both set size? I suppose it gets overwritten?

Second, still thinking about the wording of withJson. With the additivity that probably makes more sense than addJson/ofJson or similar.

Third, what happens on invalid JSON (or duplicate keys)? Early error when trying to parse JSON or is a wrong request sent to Elasticsearch?

The loadIndexDefinition() is a great example, as I expect this to be the number one source of things.

The samples would be a great candidate for text blocks, but I suppose, we need to retain Java BWC here :-)

@swallez
Copy link
Member Author

swallez commented Mar 15, 2022

@spinscale answers below

First, can one input overwrite another? The samples only show things along side. What happens in query2() if both set size? I suppose it gets overwritten?

That's right. You can also overwrite properties loaded from JSON programatically. Key point is that it overwrites existing values and doesn't merge them (this would involve complex rules, both to implement and understand). I will add a section about it in the docs to make it clear and unambiguous.

Second, still thinking about the wording of withJson. With the additivity that probably makes more sense than addJson/ofJson or similar.

Ok, so we keep withJson then 😉

Third, what happens on invalid JSON (or duplicate keys)? Early error when trying to parse JSON or is a wrong request sent to Elasticsearch?

This goes through the object's deserializer ("partial deserialization" as said in the WithJson interface javadoc), so errors are raised on unknown fields or invalid value types. There's no way to send data to Elasticsearch that could not have been built programmatically.

The samples would be a great candidate for text blocks, but I suppose, we need to retain Java BWC here :-)

[Edited] Indeed, perfect use case for Java 15 text blocks. But doc samples are extracted from test classes and this code base targets Java8...

@spinscale
Copy link
Contributor

Regarding your third point: I need to think if this is a good idea or not, as this method offers a nice way out of the problem to provide java client extensions for custom plugin functionality. I'd start with your approach and check for feedback?

Apart from that all you said makes sense! LGTM

@swallez
Copy link
Member Author

swallez commented Mar 16, 2022

Regarding your third point: I need to think if this is a good idea or not, as this method offers a nice way out of the problem to provide java client extensions for custom plugin functionality. I'd start with your approach and check for feedback?

Custom plugins that add new API endpoints and/or new component types for queries, aggregations, etc are a different topic that will be addressed specifically. Typically for components we should have a custom variant that accepts any JSON structure or object that can be serialized/deserialized.

@swallez swallez merged commit 5a2bb06 into main Mar 17, 2022
@swallez swallez deleted the raw-json-support branch March 17, 2022 14:35
@github-actions
Copy link

The backport to 7.17 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-7.17 7.17
# Navigate to the new working tree
cd .worktrees/backport-7.17
# Create a new branch
git switch --create backport-200-to-7.17
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick --mainline 1 5a2bb06b661adac122b523a84b52d80768446897
# Push it to GitHub
git push --set-upstream origin backport-200-to-7.17
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-7.17

Then, create a pull request where the base branch is 7.17 and the compare/head branch is backport-200-to-7.17.

@github-actions
Copy link

The backport to 8.1 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-8.1 8.1
# Navigate to the new working tree
cd .worktrees/backport-8.1
# Create a new branch
git switch --create backport-200-to-8.1
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick --mainline 1 5a2bb06b661adac122b523a84b52d80768446897
# Push it to GitHub
git push --set-upstream origin backport-200-to-8.1
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-8.1

Then, create a pull request where the base branch is 8.1 and the compare/head branch is backport-200-to-8.1.

swallez added a commit that referenced this pull request Mar 17, 2022
swallez added a commit that referenced this pull request Mar 17, 2022
swallez added a commit that referenced this pull request Mar 17, 2022
swallez added a commit that referenced this pull request Mar 17, 2022
@jsnyder4
Copy link

jsnyder4 commented Mar 21, 2022

Can we get a new release with this enhancement? (perhaps 8.1.1 or 8.2.0) I'm new to elasticsearch, so I was reading the official documentation for 8.1 and saw .withJson(). To my surprise, the method is not actually available in 8.1.0, which I didn't realize until finally stumbling on this pull request.

@swallez
Copy link
Member Author

swallez commented Mar 23, 2022

@jsnyder4 it is part of 8.1.1 that was released on March 18th.

@jsnyder4
Copy link

jsnyder4 commented Mar 23, 2022

Yeah, I see that now. Thank you very much. I'm using it now and it works great. Very helpful.

@jepson66
Copy link

hi,swallez
I have two questions:

  1. Using withjson, the original json format will be changed, for example, the original order is script_fields, query, build SearchRequest.of(b -> b
    After .withJson(new StringReader(reqJson)).size(0)), it becomes the order of query and script_fields, so I can't find the data, the original json is executed normally in kinaba
  2. In springboot2.7.1 using elasticsearch of version 8.2.2, RestHighLevelClient cannot be used. According to official documents and test cases, start reporting version conflict

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

indices client and xcontent/json support
5 participants