Correctly calculate doc count error at the slice level for concurrent segment search #11732
❌ Gradle check result for 1f31849: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
1f31849 to 4b4aacd
❌ Gradle check result for 4b4aacd: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
17873d6 to 1500ef2
1500ef2 to eda7d13
abbc40f to 7e64bf8
❌ Gradle check result for 7e64bf8: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
@sohami @reta I updated the PR and added write-ups for the two big issues we're seeing here. Both are also referenced in code comments. See:
Taking a small step back, I do think the way these terms aggregators are currently written is quite fragile: each piece of information seems tacked on to the previous one.
7e64bf8 to 662ab41
Created a new docs issue to add many of the details discussed here to the OpenSearch docs. The terms and significant terms docs currently lack much of this detail.
662ab41 to 7571351
❌ Gradle check result for 7571351: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
org.opensearch.search.aggregations.bucket.TermsDocCountErrorIT.testDoubleValueFieldSingleShard {p0={"search.concurrent_segment_search.enabled":"true"}}
Looks like there are a few more tests I need to fix. I'm going to force merge all of the shards in this test case to 1 segment, since the purpose of this test class is to run assertions on the doc count error and we know the doc count error will differ when concurrent search is used. I will add a comment explaining this.
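To illustrate why the doc count error differs once a shard is searched as several slices, here is a minimal sketch. It is a simplified model and not the OpenSearch implementation: the class `SliceDocCountErrorSketch`, its methods, and the sample term counts are all hypothetical. It assumes each slice keeps its top `sliceSize` terms and reports the count of the last term it kept as its error, and that the reduce step sums counts and errors across slices.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model of per-slice terms collection; not OpenSearch code.
class SliceDocCountErrorSketch {

    // Buckets returned by a slice (or a reduce) plus an upper bound on missed docs per term.
    static final class SliceResult {
        final Map<String, Long> buckets;
        final long docCountError;

        SliceResult(Map<String, Long> buckets, long docCountError) {
            this.buckets = buckets;
            this.docCountError = docCountError;
        }
    }

    // Keep the top sliceSize terms by count; if any term was dropped, the error is
    // the count of the last term kept (a dropped term can have at most that many docs).
    static SliceResult collectSlice(Map<String, Long> termCounts, int sliceSize) {
        if (sliceSize < 1) {
            throw new IllegalArgumentException("sliceSize must be >= 1");
        }
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(termCounts.entrySet());
        sorted.sort((x, y) -> {
            int byCount = Long.compare(y.getValue(), x.getValue());
            return byCount != 0 ? byCount : x.getKey().compareTo(y.getKey());
        });
        int kept = Math.min(sliceSize, sorted.size());
        Map<String, Long> buckets = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : sorted.subList(0, kept)) {
            buckets.put(e.getKey(), e.getValue());
        }
        long error = sorted.size() > sliceSize ? sorted.get(kept - 1).getValue() : 0L;
        return new SliceResult(buckets, error);
    }

    // Merge slice results: sum counts per term and sum the per-slice errors.
    static SliceResult reduce(List<SliceResult> slices) {
        Map<String, Long> merged = new TreeMap<>();
        long error = 0L;
        for (SliceResult slice : slices) {
            slice.buckets.forEach((term, count) -> merged.merge(term, count, Long::sum));
            error += slice.docCountError;
        }
        return new SliceResult(merged, error);
    }

    public static void main(String[] args) {
        // The same documents, either split across two slices or held in one segment.
        Map<String, Long> sliceA = Map.of("a", 10L, "b", 8L, "c", 5L);
        Map<String, Long> sliceB = Map.of("a", 1L, "b", 7L, "c", 9L);
        Map<String, Long> oneSegment = Map.of("a", 11L, "b", 15L, "c", 14L);

        SliceResult sliced = reduce(List.of(collectSlice(sliceA, 2), collectSlice(sliceB, 2)));
        SliceResult single = collectSlice(oneSegment, 2);
        System.out.println("sliced: " + sliced.buckets + " error=" + sliced.docCountError);
        System.out.println("single segment: " + single.buckets + " error=" + single.docCountError);
    }
}
```

With this sample data the two slices report errors of 8 and 7 (15 after the reduce), while the same documents in a single segment report 14, and the merged count for `c` is 9 rather than 14. Force merging to one segment takes that difference out of the assertions.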
… segment search. Change slice_size heuristic to be equal to shard_size. Signed-off-by: Jay Deng <jayd0104@gmail.com>
7571351 to 71a7472
❌ Gradle check result for 71a7472: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❕ Gradle check result for 71a7472: UNSTABLE. Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
… segment search. Change slice_size heuristic to be equal to shard_size. (#11732) (#11859) (cherry picked from commit b042688) Signed-off-by: Jay Deng <jayd0104@gmail.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
… segment search. Change slice_size heuristic to be equal to shard_size. (opensearch-project#11732) Signed-off-by: Jay Deng <jayd0104@gmail.com>
… segment search. Change slice_size heuristic to be equal to shard_size. (opensearch-project#11732) Signed-off-by: Jay Deng <jayd0104@gmail.com> Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Description
This PR fixes how doc count error is calculated for terms aggregations with concurrent segment search and sets the `slice_size` heuristic introduced in #11585 to be equal to the `shard_size`.
The crux of this change is the introduction of `hasSliceLevelDocCountError`, which indicates to the coordinator whether or not the `slice_size` is the reason the error exists. This affects how the top-level aggregation error is then calculated in the single shard (and single slice) scenarios.
Related Issues
Resolves #11680
Resolves #11702
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.