[Search] Set keep_alive parameter in async search #73712

Merged: 5 commits, Aug 3, 2020

lukasolson

Summary

Fixes #73443.

There are a few scenarios where we would want to automatically cancel _async_search requests:

  1. When a user navigates away (still within Kibana)
  2. When a user re-submits the query (or changes the query and re-fetches)
  3. When a user navigates outside of Kibana
  4. When a user closes the tab and/or browser

Prior to this PR, we were properly handling the first two cases, but not the last two. This PR updates our usage of _async_search to include the keep_alive property, which is set to 1m. From the documentation:

The keep_alive parameter specifies how long the async search should be available in the cluster. When not specified, the keep_alive set with the corresponding submit async request will be used. Otherwise, it is possible to override such value and extend the validity of the request. When this period expires, the search, if still running, is cancelled. If the search is completed, its saved results are deleted.

We send this parameter with every request to _async_search, so each request extends the search's lifetime to one minute from the time of that request. If the browser is closed or the user navigates away from Kibana, the requests stop, and after one minute the task in Elasticsearch will be cancelled.
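
A minimal sketch of how the keep_alive parameter could be attached to both the initial submit and every follow-up poll. The helper names `toSnakeCaseKey` and `buildAsyncSearchRequest` and the `AsyncSearchRequest` shape are hypothetical and exist only for illustration. The `100ms` and `1m` values and the path construction mirror the diff in this PR; the GET/POST split follows the Elasticsearch async search API and is assumed here, not taken from this PR's code.

```typescript
// Hypothetical helper: convert a camelCase key such as `keepAlive` to `keep_alive`.
function toSnakeCaseKey(key: string): string {
  return key.replace(/[A-Z]/g, (c) => `_${c.toLowerCase()}`);
}

// Hypothetical request shape used only in this sketch.
interface AsyncSearchRequest {
  method: 'GET' | 'POST';
  path: string;
  query: Record<string, string>;
}

// Build either the initial submit (no id) or a follow-up poll (with id).
// Both carry keep_alive, so every request pushes the expiry out again.
function buildAsyncSearchRequest(
  index: string,
  id?: string,
  queryParams: Record<string, string> = {}
): AsyncSearchRequest {
  const path = encodeURI(id ? `/_async_search/${id}` : `/${index}/_async_search`);

  // Defaults come first so that caller-supplied params can override them.
  const params: Record<string, string> = {
    waitForCompletionTimeout: '100ms',
    keepAlive: '1m',
    ...queryParams,
  };

  const query: Record<string, string> = {};
  for (const key of Object.keys(params)) {
    query[toSnakeCaseKey(key)] = params[key];
  }

  return { method: id ? 'GET' : 'POST', path, query };
}
```

Calling `buildAsyncSearchRequest('my-index')` describes the submit request, and `buildAsyncSearchRequest('my-index', someId)` describes each later poll. Once the client stops polling, nothing renews keep_alive and the search is left to expire.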

Checklist

Release notes

Kibana now sets the keep_alive parameter to 1m in _async_search requests to Elasticsearch to ensure that search requests are cancelled if a user closes the browser or navigates outside of Kibana before a request completes.

@lukasolson lukasolson requested a review from lizozom July 29, 2020 16:58
@lukasolson lukasolson requested a review from a team as a code owner July 29, 2020 16:58
@lukasolson lukasolson self-assigned this Jul 29, 2020
@elasticmachine

Pinging @elastic/kibana-app-arch (Team:AppArch)

@lizozom lizozom requested a review from jimczi July 30, 2020 08:30

@lizozom lizozom left a comment

LGTM

@jimczi jimczi left a comment

I left one unrelated comment but the change LGTM.
Note that the keep-alive is not checked aggressively, so setting it to 1m doesn't mean that the query will be cancelled right away. Still, it's a good workaround if we cannot cancel explicitly in Kibana.

@@ -81,7 +81,7 @@ async function asyncSearch(
   const path = encodeURI(request.id ? `/_async_search/${request.id}` : `/${index}/_async_search`);

   // Wait up to 1s for the response to return
-  const query = toSnakeCase({ waitForCompletionTimeout: '100ms', ...queryParams });
+  const query = toSnakeCase({ waitForCompletionTimeout: '100ms', keepAlive: '1m', ...queryParams });

Unrelated, but it would be good to set batched_reduce_size to a value greater than the default (5). We don't use partial results in Kibana at the moment, so we don't need to create a partial result every 5 shards. Something like 32 or 64 should make the query faster, and we can restore the default value when partial results are handled in Kibana. It's also outside the scope of this PR, so I can open a new one if you like.

Member Author

Are there any risks associated with increasing this value?

Nope. In blocking search the value is set to 512, so I think it's the other way around: 5 is too low and can have a negative impact on queries in terms of memory and speed. The only advantage of a low value is for partial results, so we should set it higher until we expose partial results in Kibana.
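
To make the trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which one partial reduction happens each time batched_reduce_size shard results have accumulated; this is an illustration, not Elasticsearch's exact algorithm. The function name `estimatePartialReductions` and the 640-shard figure are hypothetical; the values 5, 64, and 512 come from the discussion above.

```typescript
// Simplified model (an assumption, not Elasticsearch's exact behavior):
// one partial reduction per full batch of shard results.
function estimatePartialReductions(shardCount: number, batchedReduceSize: number): number {
  if (!Number.isInteger(shardCount) || shardCount < 0) {
    throw new RangeError('shardCount must be a non-negative integer');
  }
  if (!Number.isInteger(batchedReduceSize) || batchedReduceSize < 1) {
    throw new RangeError('batchedReduceSize must be a positive integer');
  }
  return Math.floor(shardCount / batchedReduceSize);
}

// Hypothetical search touching 640 shards.
const shards = 640;
for (const size of [5, 64, 512]) {
  console.log(`batched_reduce_size=${size}: ~${estimatePartialReductions(shards, size)} partial reductions`);
}
```

Under this model, the async default of 5 triggers many more intermediate reductions than 64 or 512 would. That extra work only pays off if the client actually consumes partial results.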

@lizozom

lizozom commented Aug 2, 2020

@elasticmachine merge upstream

@kibanamachine

💚 Build Succeeded

Build metrics

✅ unchanged


@lukasolson lukasolson merged commit 8c6d655 into elastic:master Aug 3, 2020
lukasolson added a commit to lukasolson/kibana that referenced this pull request Aug 3, 2020
* [Search] Set keep_alive parameter in async search

* Revert accidental change

* Add batched_reduce_size

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
lukasolson added a commit that referenced this pull request Aug 3, 2020
* [Search] Set keep_alive parameter in async search

* Revert accidental change

* Add batched_reduce_size

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
lukasolson added a commit that referenced this pull request Aug 3, 2020
)

* [Search] Set keep_alive parameter in async search (#73712)

* [Search] Set keep_alive parameter in async search

* Revert accidental change

* Add batched_reduce_size

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

* Fix test

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Cluster load issues due to "Run beyond timeout"
5 participants