Add pit for pagination query #2940

Merged 7 commits on Aug 23, 2024

manasvinibs
Member

@manasvinibs manasvinibs commented Aug 15, 2024

Description

Added an option to use search_after with Point in Time (PIT) instead of the scroll API for pagination queries.
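
As a rough illustration of the idea (not the code in this PR), search_after pagination requests the next page by passing the sort value of the last hit already returned, while a PIT keeps the view of the data fixed between requests. The sketch below imitates that over an in-memory sorted list. The class `SearchAfterSketch` and its methods are hypothetical and do not call any OpenSearch API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: the immutable list plays the role of an index view frozen by a PIT.
class SearchAfterSketch {

  // Returns up to pageSize keys strictly greater than afterKey.
  // Passing null for afterKey requests the first page.
  static List<Integer> nextPage(List<Integer> sortedKeys, Integer afterKey, int pageSize) {
    List<Integer> page = new ArrayList<>();
    for (Integer key : sortedKeys) {
      if (afterKey != null && key <= afterKey) {
        continue; // already returned on an earlier page
      }
      page.add(key);
      if (page.size() == pageSize) {
        break;
      }
    }
    return page;
  }

  // Walks every page and returns the keys in the order they were fetched.
  static List<Integer> fetchAll(List<Integer> sortedKeys, int pageSize) {
    List<Integer> all = new ArrayList<>();
    Integer afterKey = null;
    while (true) {
      List<Integer> page = nextPage(sortedKeys, afterKey, pageSize);
      if (page.isEmpty()) {
        break;
      }
      all.addAll(page);
      afterKey = page.get(page.size() - 1); // sort value of the last hit seen
    }
    return all;
  }

  public static void main(String[] args) {
    System.out.println(fetchAll(List.of(1, 2, 3, 4, 5), 2));
  }
}
```

Unlike scroll, the client carries the position (the last sort value) itself, so no per-request server-side cursor state is needed beyond the PIT.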

Related Issues

#2603

Manual test on sample queries - #2941

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

rupal-bq and others added 6 commits August 15, 2024 13:24
* Add search after for join

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Enable search after by default

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Add pit

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* nit

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Fix tests

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* ignore joinWithGeoIntersectNL

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Rerun CI with scroll

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Remove unused code and retrigger CI with search_after true

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Address comments

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Remove unused code change

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Update pit keep alive time with SQL_CURSOR_KEEP_ALIVE

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Fix scroll condition

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* nit

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Add pit before query execution

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* nit

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Move pit from join request builder to executor

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Remove unused methods

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Add pit in parent class's run()

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Add comment for fetching subsequent result in NestedLoopsElasticExecutor

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Update comment

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Add javadoc for pit handler

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Add pit interface

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Add pit handler unit test

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Fix failed unit test CI

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Fix spotless error

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Rename pit class and add logs

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Fix pit delete unit test

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

---------

Signed-off-by: Rupal Mahajan <maharup@amazon.com>
* Add search after for join

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Enable search after by default

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Add pit

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* nit

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Fix tests

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* ignore joinWithGeoIntersectNL

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Rerun CI with scroll

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* draft

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Remove unused code and retrigger CI with search_after true

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Address comments

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Remove unused code change

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Update pit keep alive time with SQL_CURSOR_KEEP_ALIVE

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Fix scroll condition

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* nit

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Add pit before query execution

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Refactor get response with pit method

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Update remaining scroll search calls

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Fix integ test failures

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* nit

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Move pit from join request builder to executor

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Remove unused methods

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Move pit from request to executor

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Fix pit.delete call missed while merge

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Move getResponseWithHits method to util class

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* add try catch for create delete pit in minus executor

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* move all common fields to ElasticHitsExecutor

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* add javadoc for ElasticHitsExecutor

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Add missing javadoc

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Forcing an empty commit as last commit is stuck processing updates

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

---------

Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Manasvini B S <manasvis@amazon.com>
ykmr1224 previously approved these changes Aug 22, 2024
  LOG.info("Created Point In Time {} successfully.", pitId);
  return true;
} catch (InterruptedException | ExecutionException e) {
  throw new RuntimeException("Error occurred while creating PIT.", e);
Member

What response will the end user see when we encounter these exceptions?

Member Author

@manasvinibs manasvinibs Aug 22, 2024

The query should fail. The user will see an exception with the failure message, followed by the stack trace of the underlying execution exception.

Member

Can you test it and see what error code and message come back? You can inject a fake exception to test it.

Member

You can also do this after merging the code.
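
One way to try the fake-exception experiment outside the plugin is sketched below. It only reproduces the wrapping done by the catch block under review, using a `CompletableFuture` that is completed exceptionally. The class `PitFailureSketch`, its methods, and the "fake cluster failure" message are hypothetical. It does not show the HTTP status code the plugin would return, which still needs to be checked against a running cluster.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

class PitFailureSketch {

  // Mirrors the catch block under review: any failure is rethrown as a RuntimeException.
  static String createPit(Future<String> pending) {
    try {
      return pending.get();
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException("Error occurred while creating PIT.", e);
    }
  }

  // Pushes a fake failure through createPit and reports what a caller would observe.
  static String describeFailure() {
    CompletableFuture<String> failing = new CompletableFuture<>();
    failing.completeExceptionally(new IllegalStateException("fake cluster failure"));
    try {
      createPit(failing);
      return "no failure";
    } catch (RuntimeException e) {
      // e wraps the ExecutionException, which in turn wraps the fake failure.
      Throwable root = e.getCause().getCause();
      return e.getMessage() + " | root cause: " + root.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(describeFailure());
  }
}
```

The caller sees the outer message first and has to unwrap two levels of cause to reach the original failure, which matches the "failure message followed by stack trace" behavior described above.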

CreatePitResponse pitResponse = execute.get();
pitId = pitResponse.getId();
LOG.info("Created Point In Time {} successfully.", pitId);
return true;
Member

What is this boolean used for?

Where is the false scenario?

Member Author

Currently this boolean is used only in unit tests, for assertions. When PIT creation fails, I throw an exception instead of returning false, so that query execution stops.

Member

We don't change interface methods just for unit tests.
In the unit tests, verify success some other way, such as checking that no exception is thrown, and change this method to return void.
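
A minimal sketch of what that suggestion could look like, assuming a hypothetical `PointInTimeHandler` interface and a fake implementation (neither is the interface from this PR). With JUnit 5 the same check could be written with `assertDoesNotThrow`. The plain helper below avoids the test-framework dependency.

```java
class PitHandlerSketch {

  // Hypothetical interface shape after the suggested change: create() returns nothing.
  interface PointInTimeHandler {
    void create();

    String getPitId();
  }

  // Fake handler used only to exercise the success and failure paths.
  static class FakePitHandler implements PointInTimeHandler {
    private final boolean shouldFail;
    private String pitId;

    FakePitHandler(boolean shouldFail) {
      this.shouldFail = shouldFail;
    }

    @Override
    public void create() {
      if (shouldFail) {
        throw new RuntimeException("Error occurred while creating PIT.");
      }
      pitId = "fake-pit-id";
    }

    @Override
    public String getPitId() {
      return pitId;
    }
  }

  // Returns true when create() completes without throwing.
  static boolean createsWithoutException(PointInTimeHandler handler) {
    try {
      handler.create();
      return true;
    } catch (RuntimeException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(createsWithoutException(new FakePitHandler(false)));
    System.out.println(createsWithoutException(new FakePitHandler(true)));
  }
}
```

The test can additionally assert on observable state, such as `getPitId()` being set, so the boolean return value is not needed.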

Comment on lines 189 to 216

Object[] sortFieldValue =
    AccessController.doPrivileged(
        (PrivilegedAction<Object[]>)
            () -> {
              try {
                return objectMapper.readValue(json.getString(SORT_FIELDS), Object[].class);
              } catch (JsonProcessingException e) {
                throw new RuntimeException(
                    "Failed to parse sort fields from JSON string.", e);
              }
            });
cursor.setSortFields(sortFieldValue);

// Retrieve the SearchSourceBuilder from the JSON field
String searchSourceBuilderBase64 = json.getString("searchSourceBuilder");
byte[] bytes = Base64.getDecoder().decode(searchSourceBuilderBase64);
ByteArrayInputStream streamInput = new ByteArrayInputStream(bytes);
try {
  XContentParser parser =
      XContentType.JSON
          .xContent()
          .createParser(xContentRegistry, IGNORE_DEPRECATIONS, streamInput);
  SearchSourceBuilder sourceBuilder = SearchSourceBuilder.fromXContent(parser);
  cursor.setSearchSourceBuilder(sourceBuilder);
} catch (IOException ex) {
  throw new RuntimeException("Failed to get searchSourceBuilder from cursor Id", ex);
}
Member

Can we refactor this into a new method?

Member Author

Sure, updated.

Comment on lines 89 to 95
if (pit.delete()) {
  return SUCCEEDED_TRUE;
} else {
  Metrics.getInstance().getNumericalMetric(MetricName.FAILED_REQ_COUNT_SYS).increment();
  return SUCCEEDED_FALSE;
}
Member

Does this make sense if we never return false from the delete method? Lines 91-93 would never be executed.

If you want to throw an exception in the delete method, catch the exception here and emit the metric.
If you want to return a boolean, don't throw an exception in the delete method.

Member

@vamsi-amazon vamsi-amazon Aug 23, 2024

If possible, take inspiration from how scroll is executed below.

Member Author

Yes, that's right! I updated the places that call the delete function to catch the exception and log metrics, similar to scroll.
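
The pattern agreed on in this thread can be sketched as follows: delete throws on failure, and the caller catches the exception and bumps the failure metric. Everything here is an assumption made for illustration: the `PitDeleter` interface, the `AtomicLong` standing in for the plugin's metrics registry, and the placeholder values of `SUCCEEDED_TRUE` and `SUCCEEDED_FALSE`. It is not the updated code from the PR.

```java
import java.util.concurrent.atomic.AtomicLong;

class PitDeleteSketch {

  // Stand-in for the plugin's FAILED_REQ_COUNT_SYS metric (hypothetical counter).
  static final AtomicLong FAILED_REQ_COUNT_SYS = new AtomicLong();

  // Placeholder response bodies; the real constant values are not shown in the review.
  static final String SUCCEEDED_TRUE = "{\"succeeded\":true}";
  static final String SUCCEEDED_FALSE = "{\"succeeded\":false}";

  // Hypothetical handler: delete() signals failure by throwing, not by returning false.
  interface PitDeleter {
    void delete();
  }

  // The caller owns error handling: catch the failure, emit the metric, report the outcome.
  static String closeCursor(PitDeleter pit) {
    try {
      pit.delete();
      return SUCCEEDED_TRUE;
    } catch (RuntimeException e) {
      FAILED_REQ_COUNT_SYS.incrementAndGet();
      return SUCCEEDED_FALSE;
    }
  }

  public static void main(String[] args) {
    System.out.println(closeCursor(() -> {}));
    System.out.println(
        closeCursor(
            () -> {
              throw new RuntimeException("Error occurred while deleting PIT.");
            }));
    System.out.println("failed requests: " + FAILED_REQ_COUNT_SYS.get());
  }
}
```

With this shape both branches are reachable, which addresses the dead-code concern raised about lines 91-93.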

Signed-off-by: Manasvini B S <manasvis@amazon.com>
  return String.format("%s:%s", type.getId(), encodeCursor(json));
}

private void setSearchRequestString(JSONObject cursorJson, SearchSourceBuilder sourceBuilder) {
  try {
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
Member

I don't know if we really need an output stream here:

try (XContentBuilder builder = XContentFactory.jsonBuilder()) {
  sourceBuilder.toXContent(builder, ToXContent.EMPTY_PARAMS);
  String jsonString = builder.toString();
  searchRequestBase64 = Base64.getEncoder().encodeToString(jsonString.getBytes(StandardCharsets.UTF_8));
}

Can you take this up after this PR?
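
The encoding half of that suggestion needs only the JDK: a JSON string can be Base64-encoded directly from its UTF-8 bytes, with no intermediate `ByteArrayOutputStream`. The sketch below shows the round trip with a hand-written JSON string standing in for the serialized `SearchSourceBuilder`. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64RoundTripSketch {

  // Encodes the JSON text straight from its UTF-8 bytes.
  static String encode(String json) {
    return Base64.getEncoder().encodeToString(json.getBytes(StandardCharsets.UTF_8));
  }

  // Reverses encode(): Base64 text back to the original JSON string.
  static String decode(String base64) {
    return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String json = "{\"size\":10}"; // stand-in for the serialized search source
    String encoded = encode(json);
    System.out.println(encoded);
    System.out.println(decode(encoded));
  }
}
```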

@vamsi-amazon
Member

I am approving for now. Please follow up on the issues created from this review.

  return String.format("%s:%s", type.getId(), encodeCursor(json));
}

private void setSearchRequestString(JSONObject cursorJson, SearchSourceBuilder sourceBuilder) {
Collaborator

nit: We should instead return searchRequestBase64 from this method and set it on the JSON object at the caller side. (Modifying a parameter object inside a method hurts readability and maintainability.) Please address this in a later PR.
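
A small sketch of the shape this nit asks for, assuming a plain `Map` in place of `JSONObject` and a JSON string in place of `SearchSourceBuilder`. The helper name `toSearchRequestBase64` is hypothetical. The `"searchSourceBuilder"` key is the one read back in the cursor-parsing code reviewed above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

class CursorFieldSketch {

  // Pure helper: computes the encoded value and does not touch any cursor object.
  static String toSearchRequestBase64(String searchSourceJson) {
    return Base64.getEncoder().encodeToString(searchSourceJson.getBytes(StandardCharsets.UTF_8));
  }

  // The caller decides where the value goes, so the mutation is visible at the call site.
  static Map<String, String> buildCursor(String searchSourceJson) {
    Map<String, String> cursorJson = new HashMap<>();
    cursorJson.put("searchSourceBuilder", toSearchRequestBase64(searchSourceJson));
    return cursorJson;
  }

  public static void main(String[] args) {
    System.out.println(buildCursor("{\"size\":10}"));
  }
}
```

Returning the value also makes the helper trivial to unit test, since no cursor object has to be constructed first.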

@ykmr1224 ykmr1224 merged commit 69853fe into opensearch-project:main Aug 23, 2024
14 of 15 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.x
# Create a new branch
git switch --create backport/backport-2940-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 69853fe99c4ee940fc739373e0ba04418bda71c6
# Push it to GitHub
git push --set-upstream origin backport/backport-2940-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-2940-to-2.x.

jzonthemtn pushed a commit to jzonthemtn/sql that referenced this pull request Aug 28, 2024
* Add pit for join queries (opensearch-project#2703)

* Add pit for multi query (opensearch-project#2753)

* Add pit to default cursor

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Run CI without pit unit test

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* Rerun CI without pit unit test

Signed-off-by: Rupal Mahajan <maharup@amazon.com>

* FIx unit tests for PIT changes

Signed-off-by: Manasvini B S <manasvis@amazon.com>

* Addressed comments

Signed-off-by: Manasvini B S <manasvis@amazon.com>

---------

Signed-off-by: Rupal Mahajan <maharup@amazon.com>
Signed-off-by: Manasvini B S <manasvis@amazon.com>
Co-authored-by: Rupal Mahajan <maharup@amazon.com>