[ML] Scrolling datafeeds leave open scroll contexts on error #40772

Closed
dimitris-athanasiou opened this issue Apr 3, 2019 · 1 comment · Fixed by #40773
Labels
>bug :ml Machine learning

Comments

@dimitris-athanasiou
Contributor

Datafeeds that use search/scroll should clear their scroll contexts once they are done. They do so under normal circumstances, but if an error is thrown during the search request, they appear to leave the scroll contexts open. Since we set the scroll timeout to 30m, an index pattern that matches many shards can leave quite a few scroll contexts open for a while. Combined with the change of the default value for the cluster setting search.max_open_scroll_context from unlimited to 500, this could lead to the cluster rejecting other search requests while those scroll contexts wait to be cleared.
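The failure mode described above can be sketched with a toy model. This is a minimal illustration, not the datafeed code: `ToyCluster`, `extract_leaky`, `extract_safe` and `open_contexts_after_failure` are hypothetical names, and the model assumes one scroll context per shard and a failure that happens after the contexts were opened.

```python
class ScrollContextLimitError(Exception):
    """Raised by the toy cluster when too many scroll contexts are open."""


class ToyCluster:
    """Hypothetical in-memory stand-in for a cluster; not a real client API."""

    def __init__(self, max_open_scroll_context=500):
        # Mirrors the default of search.max_open_scroll_context mentioned above.
        self.max_open_scroll_context = max_open_scroll_context
        self.open_contexts = set()
        self._next_id = 0

    def open_scroll(self):
        if len(self.open_contexts) >= self.max_open_scroll_context:
            raise ScrollContextLimitError("too many open scroll contexts")
        self._next_id += 1
        scroll_id = f"scroll-{self._next_id}"
        self.open_contexts.add(scroll_id)
        return scroll_id

    def clear_scroll(self, scroll_id):
        self.open_contexts.discard(scroll_id)


def extract_leaky(cluster, shard_count, fail=False):
    """Clears contexts only on the success path, so an error leaks them."""
    scroll_ids = [cluster.open_scroll() for _ in range(shard_count)]
    if fail:
        raise RuntimeError("simulated search failure")
    for scroll_id in scroll_ids:
        cluster.clear_scroll(scroll_id)


def extract_safe(cluster, shard_count, fail=False):
    """Clears contexts in a finally block, so they are released on error too."""
    scroll_ids = []
    try:
        for _ in range(shard_count):
            scroll_ids.append(cluster.open_scroll())
        if fail:
            raise RuntimeError("simulated search failure")
    finally:
        for scroll_id in scroll_ids:
            cluster.clear_scroll(scroll_id)


def open_contexts_after_failure(extract, shard_count):
    """Run one failing extraction and report how many contexts stay open."""
    cluster = ToyCluster()
    try:
        extract(cluster, shard_count, fail=True)
    except RuntimeError:
        pass
    return len(cluster.open_contexts)
```

In this model a failing `extract_leaky` run leaves every context it opened in place, while `extract_safe` leaves none. Enough leaked contexts make later `open_scroll` calls fail, which corresponds to the cluster rejecting other search requests.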

Steps to reproduce

There are many ways to reproduce this. Here is one.

  1. Set up a data index foo-1 with a time field (mapped as date) and some docs.

e.g.

POST foo-1/_doc
{
  "time": "2019-01-01T00:00:00Z",
  "some_value": 42.0
}
  2. Set up another data index foo-2 without that time field.

e.g.

POST foo-2/_doc
{
  "some_value": 42.0
}
  3. Create a simple count job with a datafeed using index pattern foo-*
  4. Open the job and start the datafeed
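The job and datafeed steps could look roughly like the requests built below. This is only a sketch under assumptions: the ids `foo-count` and `datafeed-foo-count` and the 15m bucket span are made up for illustration, and the paths assume the `_ml` endpoints (older versions exposed them under `_xpack/ml`). The script only prints the requests in console form; it does not contact a cluster.

```python
import json

JOB_ID = "foo-count"                 # hypothetical job id
DATAFEED_ID = "datafeed-foo-count"   # hypothetical datafeed id


def reproduction_requests():
    """Return (method, path, body) tuples for creating and starting the job."""
    job = {
        "analysis_config": {
            "bucket_span": "15m",  # assumed value, not from the issue
            "detectors": [{"function": "count"}],
        },
        "data_description": {"time_field": "time"},
    }
    datafeed = {"job_id": JOB_ID, "indices": ["foo-*"]}
    return [
        ("PUT", f"_ml/anomaly_detectors/{JOB_ID}", job),
        ("PUT", f"_ml/datafeeds/{DATAFEED_ID}", datafeed),
        ("POST", f"_ml/anomaly_detectors/{JOB_ID}/_open", None),
        ("POST", f"_ml/datafeeds/{DATAFEED_ID}/_start", None),
    ]


def to_console(requests):
    """Render the requests in the same console style used in this issue."""
    blocks = []
    for method, path, body in requests:
        text = f"{method} {path}"
        if body is not None:
            text += "\n" + json.dumps(body, indent=2)
        blocks.append(text)
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(to_console(reproduction_requests()))
```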

Observed behaviour

The datafeed should have an error notification that the search failed because not all indices have a time field to sort on. This is as expected.

Now do:

GET _nodes/stats

and look for search.open_contexts. The number there should be 0 (assuming nothing else is using the cluster). However, it is not, because the datafeed did not clear the scroll context.
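On a multi-node cluster the counter is reported per node, so it can help to add the values up. A minimal sketch, assuming the parsed response nests the value under nodes.<node id>.indices.search.open_contexts; the sample response and node ids below are invented for illustration.

```python
def total_open_contexts(nodes_stats):
    """Sum indices.search.open_contexts over all nodes in a parsed response."""
    total = 0
    for node in nodes_stats.get("nodes", {}).values():
        search_stats = node.get("indices", {}).get("search", {})
        total += search_stats.get("open_contexts", 0)
    return total


# Invented sample shaped like a trimmed GET _nodes/stats response.
sample_response = {
    "nodes": {
        "node-a": {"indices": {"search": {"open_contexts": 3}}},
        "node-b": {"indices": {"search": {"open_contexts": 2}}},
    }
}

if __name__ == "__main__":
    print(total_open_contexts(sample_response))
```

A non-zero total on an otherwise idle cluster after the datafeed has failed matches the behaviour described here.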

@dimitris-athanasiou dimitris-athanasiou added >bug :ml Machine learning labels Apr 3, 2019
@dimitris-athanasiou dimitris-athanasiou self-assigned this Apr 3, 2019
@elasticmachine
Collaborator

Pinging @elastic/ml-core

dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this issue Apr 3, 2019
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this issue May 27, 2019