-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Concurrent Segment Search Overview and Aggs HLD #6798
Comments
@sohami The proposal looks awesome. Just couple of comments:
|
@sohami thank you for assembling this one up, quite comprehensive I am confused a bit by Option 3, specifically:
AFAIK reduce happens at CollectorManager level, not collector level (or I misunderstood what the reduce at collector means). May be you could clearly state what should happens in this case and mentions that this is the hybrid of Option 1 and Option 2. |
@navneet1v Thanks for sharing the plugin. The good thing about Option 3 is that any Agg implementation will have a corresponding ResultReader that extends |
@reta I will clarify above. You are right the reduce is managed by CollectorManager across all the collector tree of the slices. This reduce phase happens in the lucene search api. What I meant to say there is for aggs collector in the collector tree, the reduce performed by CollectorManager will be no-op. Later in QueryPhase when the |
Gotha, thanks @sohami
So that basically means we would have to bear a larger (potentially, significantly larger) state from stage to stage? I do understand the reduce stage would be specific per aggregation (that would need some work) but it looks to me to be the right thing to do - implement the proper reduce phase in the CollectorManager without cutting the corners (just an opinion). |
We are not moving it between different stages, it is done in the query phase itself right after the lucene search returns. The aggregation collector state is used today to perform I am working on the PoC and will share that. Probably that will clarify more. |
[Update]: I have made PoC changes to a branch in my repository here. There are few test failures and also found few bugs with existing search path. I will be creating separate issues for the bugs followed by the PRs to fix those. Also will look into the test failures to understand if there is any gap with the approach. |
I have nominated and maintainers have agreed to invite Sorabh Hamirwasia(@sohami) to be a co-maintainer. Sorabh has kindly accepted. Sorabh has led the design and implementation of multiple features like query cancellation, concurrent segment search for aggregations and snapshot inter-operability w/remote storage in OpenSearch. Some significant issues and PRs authored by Sorabh for this effort are as follows: Feature proposals Concurrent segment search for aggregations : #6798 Searchable Remote Index : #2900 Implementations Concurrent segment search for aggregations: #7514 Lucene changes to leaf slices for concurrent search: apache/lucene#12374 Moving concurrent search to core : #7203 Query cancellation support : #986 In total, Sorabh has authored 14 PRs going back to Aug 2021. He also frequently reviews contributions from others and has reviewed nearly 20 PRs in the same time frame. I believe Sorabh will be a valuable addition as a maintainer of OpenSearch and will continue to contribute to the success of the project going forward. Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
I have nominated and maintainers have agreed to invite Sorabh Hamirwasia(@sohami) to be a co-maintainer. Sorabh has kindly accepted. Sorabh has led the design and implementation of multiple features like query cancellation, concurrent segment search for aggregations and snapshot inter-operability w/remote storage in OpenSearch. Some significant issues and PRs authored by Sorabh for this effort are as follows: Feature proposals Concurrent segment search for aggregations : #6798 Searchable Remote Index : #2900 Implementations Concurrent segment search for aggregations: #7514 Lucene changes to leaf slices for concurrent search: apache/lucene#12374 Moving concurrent search to core : #7203 Query cancellation support : #986 In total, Sorabh has authored 14 PRs going back to Aug 2021. He also frequently reviews contributions from others and has reviewed nearly 20 PRs in the same time frame. I believe Sorabh will be a valuable addition as a maintainer of OpenSearch and will continue to contribute to the success of the project going forward. Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com> (cherry picked from commit 7769682) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
I have nominated and maintainers have agreed to invite Sorabh Hamirwasia(@sohami) to be a co-maintainer. Sorabh has kindly accepted. Sorabh has led the design and implementation of multiple features like query cancellation, concurrent segment search for aggregations and snapshot inter-operability w/remote storage in OpenSearch. Some significant issues and PRs authored by Sorabh for this effort are as follows: Feature proposals Concurrent segment search for aggregations : #6798 Searchable Remote Index : #2900 Implementations Concurrent segment search for aggregations: #7514 Lucene changes to leaf slices for concurrent search: apache/lucene#12374 Moving concurrent search to core : #7203 Query cancellation support : #986 In total, Sorabh has authored 14 PRs going back to Aug 2021. He also frequently reviews contributions from others and has reviewed nearly 20 PRs in the same time frame. I believe Sorabh will be a valuable addition as a maintainer of OpenSearch and will continue to contribute to the success of the project going forward. (cherry picked from commit 7769682) Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
I have nominated and maintainers have agreed to invite Sorabh Hamirwasia(@sohami) to be a co-maintainer. Sorabh has kindly accepted. Sorabh has led the design and implementation of multiple features like query cancellation, concurrent segment search for aggregations and snapshot inter-operability w/remote storage in OpenSearch. Some significant issues and PRs authored by Sorabh for this effort are as follows: Feature proposals Concurrent segment search for aggregations : opensearch-project#6798 Searchable Remote Index : opensearch-project#2900 Implementations Concurrent segment search for aggregations: opensearch-project#7514 Lucene changes to leaf slices for concurrent search: apache/lucene#12374 Moving concurrent search to core : opensearch-project#7203 Query cancellation support : opensearch-project#986 In total, Sorabh has authored 14 PRs going back to Aug 2021. He also frequently reviews contributions from others and has reviewed nearly 20 PRs in the same time frame. I believe Sorabh will be a valuable addition as a maintainer of OpenSearch and will continue to contribute to the success of the project going forward. Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
I have nominated and maintainers have agreed to invite Sorabh Hamirwasia(@sohami) to be a co-maintainer. Sorabh has kindly accepted. Sorabh has led the design and implementation of multiple features like query cancellation, concurrent segment search for aggregations and snapshot inter-operability w/remote storage in OpenSearch. Some significant issues and PRs authored by Sorabh for this effort are as follows: Feature proposals Concurrent segment search for aggregations : opensearch-project#6798 Searchable Remote Index : opensearch-project#2900 Implementations Concurrent segment search for aggregations: opensearch-project#7514 Lucene changes to leaf slices for concurrent search: apache/lucene#12374 Moving concurrent search to core : opensearch-project#7203 Query cancellation support : opensearch-project#986 In total, Sorabh has authored 14 PRs going back to Aug 2021. He also frequently reviews contributions from others and has reviewed nearly 20 PRs in the same time frame. I believe Sorabh will be a valuable addition as a maintainer of OpenSearch and will continue to contribute to the success of the project going forward. Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com> Signed-off-by: Kaushal Kumar <ravi.kaushal97@gmail.com>
I have nominated and maintainers have agreed to invite Sorabh Hamirwasia(@sohami) to be a co-maintainer. Sorabh has kindly accepted. Sorabh has led the design and implementation of multiple features like query cancellation, concurrent segment search for aggregations and snapshot inter-operability w/remote storage in OpenSearch. Some significant issues and PRs authored by Sorabh for this effort are as follows: Feature proposals Concurrent segment search for aggregations : opensearch-project#6798 Searchable Remote Index : opensearch-project#2900 Implementations Concurrent segment search for aggregations: opensearch-project#7514 Lucene changes to leaf slices for concurrent search: apache/lucene#12374 Moving concurrent search to core : opensearch-project#7203 Query cancellation support : opensearch-project#986 In total, Sorabh has authored 14 PRs going back to Aug 2021. He also frequently reviews contributions from others and has reviewed nearly 20 PRs in the same time frame. I believe Sorabh will be a valuable addition as a maintainer of OpenSearch and will continue to contribute to the success of the project going forward. Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com> Signed-off-by: Ivan Brusic <ivan.brusic@flocksafety.com>
I have nominated and maintainers have agreed to invite Sorabh Hamirwasia(@sohami) to be a co-maintainer. Sorabh has kindly accepted. Sorabh has led the design and implementation of multiple features like query cancellation, concurrent segment search for aggregations and snapshot inter-operability w/remote storage in OpenSearch. Some significant issues and PRs authored by Sorabh for this effort are as follows: Feature proposals Concurrent segment search for aggregations : opensearch-project#6798 Searchable Remote Index : opensearch-project#2900 Implementations Concurrent segment search for aggregations: opensearch-project#7514 Lucene changes to leaf slices for concurrent search: apache/lucene#12374 Moving concurrent search to core : opensearch-project#7203 Query cancellation support : opensearch-project#986 In total, Sorabh has authored 14 PRs going back to Aug 2021. He also frequently reviews contributions from others and has reviewed nearly 20 PRs in the same time frame. I believe Sorabh will be a valuable addition as a maintainer of OpenSearch and will continue to contribute to the success of the project going forward. Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com> Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Closing this issue as Concurrent Search is GA and other follow-up items will be tracked in their respective issues. |
Is your feature request related to a problem? Please describe.
Concurrent Search plugin is added as a sandbox plugin with some pending work tracked here. This issue provides overview of current functionality and high level design options for supporting aggregation along with other open items that should be considered
Describe the solution you'd like
Background:
Today, in OpenSearch the indexing and search request is powered by Lucene library. An OpenSearch index consists of multiple shards and each of these shards represents a Lucene index. Each shard or Lucene index can have multiple segments in it. When a search request is served at an index level then it scatters the request to all of its shards (with an upper limit at per node level) and gather responses from all the shards to merge and create a final response to be returned to the user. On each shard the search request is executed sequentially over the segments in that shard. In Lucene, there is support for the concurrent search over multiple segments which was added in OpenSearch as an experimental sandbox plugin. Currently the concurrent search plugins doesn’t support Aggs and have few other gaps which are captured below and will create different issues for it eventually.
Overview of Sequential + Concurrent Segment Search:
In OpenSearch, for each search request the query execution on shard level looks something like below for the sequential query execution. The coordinator level search request can be performed in multiple phases: (i) CanMatch (optional) - Phase to pre-filter shards based on time range, (ii) DFS (optional) - To improve on the relevance of the document score by using global term frequency and document frequency instead of local shard level info (iii) Query - Actual search happens as part of this on each shard where only docIds are returned along with scores and (iv) Fetch - The requested fields for each document is retrieved post query phase. Among all the phases the most expensive will be query phase where all the scoring and document collection takes place across the segments. This is the phase which uses lucene search api and where concurrency is also introduced to search across different segments in the shard (or lucene index).
Current query phase execution have below operations which is performed on single search thread executing that phase.
Example for a simple query:
With concurrent search support the query execution plan will look like below. A Segment Slice is a group of segments which are assigned to each concurrent searcher thread which will perform the search/collection over those assigned segments. Depending upon the document count and segment count all the segments are partitioned into multiple slices by the IndexSearcher. Lucene uses 2 parameters
MAX_SEGMENTS_PER_SLICE (default to 5)
andMAX_DOCS_PER_SLICE (default to 250K)
to create the slice group. While the concurrent searcher threads are performing the search on their assigned slices, the main search thread also get the last slice assigned to it for execution. Upon completion it calls reduce on CollectorManager tree with the list of all the collectors tree created per slice. The reduce mechanism is the way to merge all the collected documents and create final documents list which is used to create or populate the query result in the end. Note: Today if there is any aggregation operation present in the search request the sequential path will be executed instead of concurrent search pathConcurrent segment search flow:
Design Concurrent Search for Aggs:
To support concurrent search model the operators or collectors will need to implement the CollectorManager interface provided by Lucene. Operators such as
TopDocs, TopFields, TopScoreDocs, etc
which are native to Lucene also provides mechanism to create the correspondingCollectorManager
and performreduce
across leaf slices collectors in reduce phase. OpenSearch has to utilize those which are inherently supported and implement for others likeEarlyTerminatingCollector
which are not native to Lucene. Similarly all the aggregation operators are not native to Lucene it is developed only in OpenSearch application layer using the Lucene collector interface.Properties of Aggregators:
BucketCollector
class which implements theCollector
interface of Lucene. BucketCollector is of 2 major types: 1)DeferringBucketCollector
and 2)Aggregator
where Aggregator is further split into 3 categories: a)MetricsAggregator
b)BucketAggregators
and c)NonCollectingAggregator
(for fields with no mapping).InternalAggregations
object for the aggregation tree which is intermediate data structure to hold on to aggregation results. This result is returned to the coordinator to perform the final merge/reduce of results from all the shardsFor supporting concurrency across all aggregators we can use one of the following options with recommended one being Option 3.
Option 1:
CollectorManager
for each of the aggregator or some mechanism of a commonCollectorManager
implementation which can take care of creating differentAggregator
collectors (orBucketCollectors
) per slice. The collectors of each slice needs to have its own state (or internal data structure) which will be populated with documents collected from that sliceAggregators
(orBucketCollectors
) can remain as is which is creating intermediateInternalAggregation
objects to serialize and send to coordinatorPros:
Cons:
Option 2:
CollectorManager
forAggregator
(same as above)CollectorManager::reduce
for Aggregation collectors on shards, create intermediateInternalAggregation
data structures for list ofAggregation
collectors (depending on number of slices) at each level of aggregation tree as compared to a singleAggregation
collector todayInternalAggregation
per aggregator operator in the tree (instead of 1) from a shard and perform the merge/reduce on theseInternalAggregation
object. Coordinator must already be performing this across the shards. For simplicity, we can think of this as it is performing merges ofInternalAggregation
list with 1 object per shard. Now it needs to perform merges for list of list of InternalAggregation with 1 list of objects per shard. This will be true for each level of the aggregation tree and for all the different aggregations in the request.Pros:
Cons:
DEFAULT_BATCHED_REDUCE_SIZE = 512
) such that reduce happens more frequently but that will still consume the CPU cycles.Option 3 (Recommended):
CollectorManager::reduce
in lucene search phase.InternalAggregation
is created. First level of merges/reduce happen at the shard level to merge theInternalAggregation
across slices on the shard. Then it sends the merged output to the coordinatorPros:
Cons:
Option 4:
CollectorManager
for Aggregator (same as above)Pros:
Cons:
Concurrent segment search with Aggs:
Different Agg operators supported in OpenSearch
Describe alternatives you've considered
All the options are listed above
Additional context
Different tasks which will be done in incremental way in addition to what is called out in META issue
The text was updated successfully, but these errors were encountered: