[Search] Search cancelation is fragile, searches might be running for days #106395
Labels: bug, Feature:Search Sessions, Feature:Search, impact:high, loe:large, performance
TLDR: the default search expiration is 7 days. If Kibana fails to delete an async search for any reason, the search can keep running for days, putting unnecessary load on the cluster.
Version: since 7.12
This came up during a quick investigation of how we cancel searches, after a customer noticed unexpected, hanging async searches in their cluster. It appears there are currently multiple scenarios where Kibana can initiate a long-running search with a 7-day expiration limit and never clean it up.
As I understand it, this is how it currently works when search sessions are enabled (the default):
Start a search. Don't save the search session. The search's expiration is set to 7 days. This means that Elasticsearch will keep searching for up to 7 days, or until Kibana deletes the search.
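To make the flow above concrete, here is a minimal sketch of the two Elasticsearch requests involved: submitting an async search with a 7-day `keep_alive`, and the `DELETE` call that is needed to cancel it early. The helper names (`buildSubmitRequest`, `buildDeleteRequest`), the `EsRequest` type and the `100ms` wait value are made up for illustration and are not Kibana's actual code; only the `_async_search` endpoints and their query parameters come from Elasticsearch.

```typescript
// Hypothetical request descriptor; not a Kibana or Elasticsearch client type.
interface EsRequest {
  method: 'POST' | 'DELETE';
  path: string;
}

// Build the request that submits an async search.
// `keep_alive` controls how long Elasticsearch keeps the search (and its
// results) around; the 7d default mirrors the expiration described above.
function buildSubmitRequest(index: string, keepAlive: string = '7d'): EsRequest {
  const query = [
    `keep_alive=${encodeURIComponent(keepAlive)}`,
    // Illustrative value: how long to wait before returning a search id.
    `wait_for_completion_timeout=${encodeURIComponent('100ms')}`,
  ].join('&');
  return {
    method: 'POST',
    path: `/${encodeURIComponent(index)}/_async_search?${query}`,
  };
}

// Build the request that cancels a running async search and removes its results.
// If this request is never sent, the search lives until `keep_alive` runs out.
function buildDeleteRequest(searchId: string): EsRequest {
  return {
    method: 'DELETE',
    path: `/_async_search/${encodeURIComponent(searchId)}`,
  };
}
```

If nothing ever sends the `DELETE` request, the only remaining limit on the search is the `keep_alive` value passed at submission.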
Kibana deletes searches in the following scenarios:
The problem with this setup:
So if the browser doesn't act on `search:timeout` (because of a bug, or because the user navigates away from Kibana) and there's any problem with the Kibana session monitoring task (e.g. a bug like #105726, or the monitoring task is turned off), or Kibana is simply not running, then the search might continue for days.
Possible solution:
Do not set such a long expiration (`7d`)? Or approach it another way, for example by extending searches from Kibana's monitoring task only for persisted sessions?
cc @lizozom @lukasolson @elastic-jb I hope my understanding of the current setup is correct.
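A rough sketch of that second idea, assuming searches are started with a short `keep_alive` and a periodic monitoring task decides what to do with each one. `TrackedSearch`, `KeepAliveAction` and `planKeepAlive` are hypothetical names, not existing Kibana code; the only real API used is `GET /_async_search/<id>?keep_alive=...`, which extends the expiration of an existing async search.

```typescript
// Hypothetical bookkeeping record for a search started by Kibana.
interface TrackedSearch {
  id: string;
  sessionPersisted: boolean; // true once the user has saved the search session
}

// What the monitoring task should do with a search on this tick.
type KeepAliveAction =
  | { kind: 'extend'; method: 'GET'; path: string }
  | { kind: 'let-expire' };

// Only persisted sessions get their expiration extended. Everything else
// keeps its short initial keep_alive and expires on its own, even if the
// browser or Kibana never gets the chance to delete it.
function planKeepAlive(search: TrackedSearch, extension: string = '7d'): KeepAliveAction {
  if (!search.sessionPersisted) {
    return { kind: 'let-expire' };
  }
  return {
    kind: 'extend',
    method: 'GET',
    path: `/_async_search/${encodeURIComponent(search.id)}?keep_alive=${encodeURIComponent(extension)}`,
  };
}
```

With this shape, a monitoring task that is broken or turned off means persisted sessions expire early, rather than unsaved searches running for days.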