[Concurrent Segment Search] Explore different metrics/stats which will be useful with concurrent segment search #7359
Existing metrics:
New Metrics:
Sample requests for reference (without new metrics):
Reference PRs for PIT changes:
Thanks @sohami, two more to suggest (the naming could be better):
I think these metrics should help with sizing the index searcher thread pool properly.
@reta thanks for the suggestion! It seems like these metrics should go under
Thread pool queue size stats are available for all thread pools via
    public CommonStats getTotal() {
        if (total != null) {
            return total;
        }
        CommonStats stats = new CommonStats();
        for (ShardStats shard : shards) {
            stats.add(shard.getStats());
        }
        total = stats;
        return stats;
    }
This makes it difficult to compute the average concurrency across all shards, for two reasons. First, only shards with an average concurrency greater than 0 should be considered, because average concurrency only covers requests that use concurrent search. Second, we need a way to track how many shards in the overall response have a value greater than 0 and factor that count into the average.
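One way to handle both points is to carry a sum and a count instead of a single averaged value, so that merging stays correct no matter how many shards are added. The following is a minimal sketch only: `AvgConcurrencyAccumulator` and its methods are hypothetical names for illustration and are not part of `CommonStats` or any existing OpenSearch class.

```java
/**
 * Hypothetical accumulator (not an OpenSearch class): keeps the sum of per-shard
 * average concurrency values and the number of shards that reported a value > 0.
 */
final class AvgConcurrencyAccumulator {
    private double sumOfShardAverages;
    private int shardsWithConcurrentSearch;

    /** Adds one shard's average concurrency; shards reporting 0 are skipped. */
    void addShard(double shardAvgConcurrency) {
        if (shardAvgConcurrency > 0) {
            sumOfShardAverages += shardAvgConcurrency;
            shardsWithConcurrentSearch++;
        }
    }

    /** Merges another accumulator, in the same spirit as the stats.add(...) loop above. */
    void add(AvgConcurrencyAccumulator other) {
        sumOfShardAverages += other.sumOfShardAverages;
        shardsWithConcurrentSearch += other.shardsWithConcurrentSearch;
    }

    /** Number of shards that contributed a value > 0. */
    int activeShards() {
        return shardsWithConcurrentSearch;
    }

    /** Mean over contributing shards only; 0 when no shard used concurrent search. */
    double average() {
        return shardsWithConcurrentSearch == 0 ? 0.0 : sumOfShardAverages / shardsWithConcurrentSearch;
    }
}
```

With shard values 0, 4 and 2, a plain mean over all three shards would report 2, while this accumulator reports 3 because the shard that never ran a concurrent search is excluded. Note that this is an unweighted mean of shard averages; weighting by the number of concurrent requests per shard would need an additional counter.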
thread_pool.pool_wait_time
The existing thread_pool metrics come from the ThreadPoolExecutor class in java.util.concurrent. See https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/concurrent/ThreadPoolExecutor.html for more details.
OpenSearch/server/src/main/java/org/opensearch/threadpool/ThreadPool.java
Lines 384 to 396 in aca2e9d
    if (holder.executor() instanceof ThreadPoolExecutor) {
        ThreadPoolExecutor threadPoolExecutor = (ThreadPoolExecutor) holder.executor();
        threads = threadPoolExecutor.getPoolSize();
        queue = threadPoolExecutor.getQueue().size();
        active = threadPoolExecutor.getActiveCount();
        largest = threadPoolExecutor.getLargestPoolSize();
        completed = threadPoolExecutor.getCompletedTaskCount();
        RejectedExecutionHandler rejectedExecutionHandler = threadPoolExecutor.getRejectedExecutionHandler();
        if (rejectedExecutionHandler instanceof XRejectedExecutionHandler) {
            rejected = ((XRejectedExecutionHandler) rejectedExecutionHandler).rejected();
        }
    }
    stats.add(new ThreadPoolStats.Stats(name, threads, queue, active, rejected, largest, completed));
Since the executor class does not provide wait time, we would need to implement our own wait time calculation. This is a fairly involved change that affects all thread pools, so I will create a separate issue to track it; I do believe wait time is a valuable metric to have.
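To illustrate what such a calculation could look like, here is a rough sketch that timestamps each task on submission and measures the queue wait in `beforeExecute`, a hook that `ThreadPoolExecutor` does provide. The class `WaitTimeTrackingExecutor`, its wrapper `TimedRunnable`, and the accessor names are assumptions made for this example; this is not how OpenSearch thread pools are implemented, and a real change would have to fit the existing executor types.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical executor that records how long tasks wait before a worker picks them up. */
final class WaitTimeTrackingExecutor extends ThreadPoolExecutor {
    private final LongAdder totalWaitNanos = new LongAdder();
    private final LongAdder startedTasks = new LongAdder();

    WaitTimeTrackingExecutor(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    public void execute(Runnable command) {
        // Wrap the task so the submission timestamp travels with it through the queue.
        super.execute(new TimedRunnable(command));
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        // Called on the worker thread right before the task runs.
        if (r instanceof TimedRunnable) {
            totalWaitNanos.add(System.nanoTime() - ((TimedRunnable) r).submittedNanos);
            startedTasks.increment();
        }
        super.beforeExecute(t, r);
    }

    long totalWaitNanos() {
        return totalWaitNanos.sum();
    }

    long startedTasks() {
        return startedTasks.sum();
    }

    /** Average wait per started task, or 0 if nothing has started yet. */
    double avgWaitNanos() {
        long started = startedTasks.sum();
        return started == 0 ? 0.0 : (double) totalWaitNanos.sum() / started;
    }

    private static final class TimedRunnable implements Runnable {
        final Runnable delegate;
        final long submittedNanos = System.nanoTime(); // captured at submission

        TimedRunnable(Runnable delegate) {
            this.delegate = delegate;
        }

        @Override
        public void run() {
            delegate.run();
        }
    }
}
```

The sketch measures time from submission to the start of execution, so it includes time spent in the queue as well as any delay in handing the task to a worker. Rejected tasks never reach `beforeExecute` and are not counted.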
Tracking the remaining metrics in separate issues:
Placeholder task to explore and add metrics that will be useful for the concurrent segment search execution model. These metrics can provide insight into: i) the performance of shard-level requests (min/max/avg latencies across requests at the index/node level), ii) how many requests used the concurrent search path vs. the sequential path, iii) the concurrency used across requests at the index/node level, etc.
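As a rough illustration of the three kinds of metrics listed above, the sketch below keeps per-path request counts, min/max/avg latency, and average concurrency in one holder. Everything here is hypothetical: `SearchPathStats`, the `record` signature, and the idea of passing a slice count per request are assumptions for the example and do not reflect the stats classes that OpenSearch actually exposes.

```java
/** Hypothetical holder for shard-level search stats (illustration only). */
final class SearchPathStats {
    private long concurrentCount;
    private long sequentialCount;
    private long totalSlices;                 // summed over concurrent requests only
    private long minTookNanos = Long.MAX_VALUE;
    private long maxTookNanos;
    private long totalTookNanos;

    /** Records one shard-level request and which execution path it used. */
    synchronized void record(boolean usedConcurrentPath, int sliceCount, long tookNanos) {
        if (usedConcurrentPath) {
            concurrentCount++;
            totalSlices += sliceCount;
        } else {
            sequentialCount++;
        }
        minTookNanos = Math.min(minTookNanos, tookNanos);
        maxTookNanos = Math.max(maxTookNanos, tookNanos);
        totalTookNanos += tookNanos;
    }

    synchronized long concurrentCount() { return concurrentCount; }

    synchronized long sequentialCount() { return sequentialCount; }

    /** Minimum latency seen, or 0 if no request has been recorded. */
    synchronized long minTookNanos() {
        return (concurrentCount + sequentialCount) == 0 ? 0 : minTookNanos;
    }

    synchronized long maxTookNanos() { return maxTookNanos; }

    synchronized double avgTookNanos() {
        long total = concurrentCount + sequentialCount;
        return total == 0 ? 0.0 : (double) totalTookNanos / total;
    }

    /** Average slices per request, counting only requests on the concurrent path. */
    synchronized double avgConcurrency() {
        return concurrentCount == 0 ? 0.0 : (double) totalSlices / concurrentCount;
    }
}
```

Keeping raw counts and totals rather than precomputed averages makes it straightforward to roll the numbers up from shard level to index or node level later.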