Cluster load issues due to "Run beyond timeout" #73443
Comments
Pinging @elastic/kibana-app-arch (Team:AppArch)
cc @Dosant
Have we verified that the search tasks are not cleaned up when the user navigates away or closes the browser? If so, that's a bug in Kibana for sure.
@lukasolson I saw search tasks with multi hour runtimes in production clusters with a description
@lukasolson I looked into this together with @Dosant and we were under the impression there was no logic in place to do that cleanup after the user has hit "Run beyond timeout". If there is a mechanism for that, we should verify whether it's actually caused by Kibana (e.g. because it doesn't work 100% of the time) or whether those users simply triggered those searches manually or via other integrations.
Okay, so I spent some time looking into this yesterday. There are a few scenarios where we would want to cancel async search requests:
We are properly handling the first two cases, but not the last two. I was under the impression that the After talking with @lizozom, there's a simple solution for this. We can send the
I wanted to add that this will not solve the issue entirely. Because there is no prioritization of search requests in Elasticsearch, large async search requests run at the same priority as internal Kibana requests. As a result, there will likely still be many scenarios where large async search requests take down Kibana. The solution for this would have to be something inside Elasticsearch. The currently proposed issue for this can be found here: elastic/elasticsearch#37867
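For the cancellation half of the fix (as opposed to the prioritization gap just described), here is a minimal TypeScript sketch of tracking in-flight async searches and deleting them when the user navigates away. It is not Kibana's actual implementation: `AsyncSearchTracker`, `Transport`, and the method names are hypothetical, and the only Elasticsearch call assumed is the async search delete endpoint, `DELETE /_async_search/<id>`.

```typescript
// Hypothetical transport: stands in for whatever HTTP client is actually in use.
type Transport = (method: "DELETE", path: string) => Promise<unknown>;

// Tracks async search IDs started by one page or session so that
// they can all be cancelled together during cleanup.
class AsyncSearchTracker {
  private readonly ids = new Set<string>();
  private readonly transport: Transport;

  constructor(transport: Transport) {
    this.transport = transport;
  }

  // Call when Elasticsearch returns an async search ID.
  track(id: string): void {
    this.ids.add(id);
  }

  // Call when a search completes on its own.
  untrack(id: string): void {
    this.ids.delete(id);
  }

  // Call on navigation away or unmount: issues one DELETE per search
  // that is still tracked, then forgets all tracked IDs.
  cancelAll(): Promise<void> {
    const pending = Array.from(this.ids, (id) =>
      this.transport("DELETE", `/_async_search/${encodeURIComponent(id)}`)
        // A failed delete (for example, the search already finished) is ignored.
        .catch(() => undefined)
    );
    this.ids.clear();
    return Promise.all(pending).then(() => undefined);
  }
}
```

A caller would invoke `cancelAll()` from whatever hook fires on navigation or unmount. Whether a delete always stops the underlying task promptly is a separate question, which the next comments raise.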
@lukasolson I've also seen cancel tasks running for the same amount of time as the orphaned searches. Maybe this means that even if the cancellation request is sent, it won't work every time. Might be worth looking into this as well:
@jimczi @dnhatn I'm not sure exactly where this cancellation would fall, but can one of you say if we have an issue in ES on the cancellation side from @lukasolson's post?
Kibana version: 7.7 upwards
Describe the bug: The "Run beyond timeout" feature lets searches run indefinitely once the user clicks the button in the prompt. This can cause cluster load issues and in some cases even bring the cluster down, because extremely costly searches (running for multiple hours) can go unnoticed and continue to run in Elasticsearch if the user abandons the dashboard after a while.
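One way to notice such searches is to poll the task management API (`GET _tasks?detailed=true`) and flag search tasks that have been running longer than some threshold. The TypeScript sketch below covers only the filtering step. The interfaces model just a subset of the response fields, and `findLongRunningSearches`, the substring match on `action`, and the threshold are assumptions for illustration, not something this issue prescribes.

```typescript
// Subset of the per-task fields returned by GET _tasks?detailed=true.
interface TaskInfo {
  node: string;
  id: number;
  action: string;
  description?: string;
  running_time_in_nanos: number;
  cancellable: boolean;
}

// Subset of the response shape: tasks grouped by node, keyed by "<node>:<id>".
interface TasksResponse {
  nodes: Record<string, { tasks: Record<string, TaskInfo> }>;
}

const NANOS_PER_MINUTE = 60 * 1e9;

// Returns the IDs of cancellable search tasks that have been running
// for more than `maxMinutes`.
function findLongRunningSearches(
  response: TasksResponse,
  maxMinutes: number
): string[] {
  const taskIds: string[] = [];
  for (const node of Object.values(response.nodes)) {
    for (const [taskId, task] of Object.entries(node.tasks)) {
      // Assumption: any action name containing "search" counts as a search task.
      const isSearch = task.action.indexOf("search") !== -1;
      const tooLong = task.running_time_in_nanos > maxMinutes * NANOS_PER_MINUTE;
      if (isSearch && task.cancellable && tooLong) {
        taskIds.push(taskId);
      }
    }
  }
  return taskIds;
}
```

Each returned ID could then be reviewed by an operator or passed to `POST _tasks/<task_id>/_cancel`.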
Steps to reproduce:
Expected behavior:
There are different options for improving the situation:
This is probably not a bug, but it's easy to misuse the feature in practice.