Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Add pre/post slice execution hook to SearchOperationListener for concurrent segment search to collect search metrics & stats #15136

Closed
jed326 opened this issue Aug 6, 2024 · 3 comments · Fixed by #15153
Assignees
Labels
enhancement Enhancement or improvement to existing feature or request Roadmap:Search Project-wide roadmap label Search:Performance v2.17.0 v3.0.0 Issues and PRs related to version 3.0.0

Comments

@jed326
Copy link
Collaborator

jed326 commented Aug 6, 2024

Is your feature request related to a problem? Please describe

The basic problem I am trying to solve here is that from within a search request (ie. everything that comes after ContextIndexSearcher#search), it’s very difficult to figure out which which search task the current thread is executing against. Because of this, it’s very difficult to collect statistics and metrics from within the thread execution for the search task.

At a very high level, in order to collect metrics and stats during a search request and relate it back to the currently executing search request, we need some way to reference the task id (or some other unique identifier for the request) while the query is running. Before concurrent segment search one such unique identifier could have been the thread id, however with concurrent segment search this invariant is no longer valid as the same request may be served by multiple threads and, vice-versa, the same thread may serve multiple search requests.

One example to illustrate how this is needed is the SearchSlowLog. Today this class collects all of its information from the SearchContext and for requests that take longer than a configurable amount of time T this information is printed to the slow log. However, this information is limited in its usefulness as the only stat reported is the overall duration of the request and there are no breakdowns related to how long different parts of the query took. While the query profiler can partially fill in the gap here, it’s not viable to run all queries through the query profiler due to the non-negligble overhead introduced by the additional collectors when profiling is enabled.

More concretely, when we look at searching against remote data (such as with searchable snapshots and writable warm), much of the search query time will be spent in downloading blocks from the remote repo yet there's no way to gather query level details about this from the Collector as these are implemented at the IndexInput level. For example, for debugging long running queries it would be critical to know the number of blocks being downloaded, the number of bytes being downloaded, and the file cache hits and misses, etc.

Furthermore, looking forward Lucene 10 is introducing a prefetch notion to the IndexInput interface and if want to measure performance gains from any potential prefetch implementations we will need to be able to collect this information per-query.

Describe the solution you'd like

The solution I am suggesting is to add to the existing SearchOperationListener, with a pre/post slice execution phase that will happen right before/after ContextIndexSearcher#Search. This will give the thread executions an easy way to keep a reference to the task ID it is currently working on.

Quick POC: jed326@3bcdfa1

Related component

Search:Performance

Describe alternatives you've considered

RunnableTaskExecutionListener

Today the TaskResourceTrackingService already tracks CPU/memory usage at the task level across threads. It does so by implementing the RunnableTaskExecutionListener interface which has a hook to run before and after each task of the type TaskAwareRunnable and this includes tasks in the concurrent search threadpool. One possible solution is to open up this RunnableTaskExecutionListener as a pluggable interface and this naturally gives all tasks that implement the corresponding runnable the ability to have pre- and post- execution actions.

Why did I not choose this?
The main issue with this approach is that a plugin implementation will typically not want to do the same resource tracking on every single thread pool. So for example if I want to track search related metrics, I would only want to do so on the search and index_searcher (and search_throttled) threadpools, however there’s no way to make that distinction in the existing interface.

As for the existing TaskResourceTrackingService, it does not run within any of the threads that it is tracking and therefore can only get surface level information like the memory allocated and cpu utilization, it would not be able to implement any metrics collectors within the threads.

TaskAwareRunnable

The existing TaskAwareRunnable that is used by the TaskResourceTrackingService actually does have a reference to the task id and it is used in the doRun method to initialize the task tracking. As is there’s no way to access the ThreadContext reference from within the original task execution that is wrapped by the TaskAwareRunnable, however one solution is to allow plugins to provide their own Runnable wrappers to take advantage of this ThreadContext.

Why did I not choose this?
Having multiple plugins provide multiple wrapping implementations will get very complicated very quickly and will likely get into a situation in which the order of wrapping becomes another problem that needs to be solved.

Additional context

No response

@jed326 jed326 added enhancement Enhancement or improvement to existing feature or request untriaged labels Aug 6, 2024
@jed326 jed326 changed the title [RFC] Add pre/post slice execution hook to SearchOperationListener for concurrent segment search [RFC] Add pre/post slice execution hook to SearchOperationListener for concurrent segment search to collect search metrics & stats Aug 6, 2024
@jed326
Copy link
Collaborator Author

jed326 commented Aug 6, 2024

@sohami @reta curious to get your thoughts on this, thanks!

@jed326 jed326 self-assigned this Aug 6, 2024
@andrross andrross added the Roadmap:Search Project-wide roadmap label label Aug 6, 2024
@sohami
Copy link
Collaborator

sohami commented Aug 7, 2024

@jed326 Thanks for creating the proposal. I do see the slice level callbacks to be useful and the use cases makes sense. As you pointed out to capture per slice/search request level information about the blocks accessed or time take to download those vs FileCache hit/miss will be useful to understand the query latency. This may also bring in some opportunity to simplify how search profile related informations are captured in concurrent path.

@reta
Copy link
Collaborator

reta commented Aug 7, 2024

Sorry for delay @jed326 , I also agree with @sohami that such callbacks could be useful since the SearchContext has reference to the search task (for corelation purposes). Thank you.

@reta reta added v3.0.0 Issues and PRs related to version 3.0.0 v2.17.0 labels Aug 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement or improvement to existing feature or request Roadmap:Search Project-wide roadmap label Search:Performance v2.17.0 v3.0.0 Issues and PRs related to version 3.0.0
Projects
Status: New
Status: Done
Development

Successfully merging a pull request may close this issue.

5 participants