Added a new index level setting to limit the total primary shards per node per index #17295
❌ Gradle check result for 721865e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for 920f71a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #17295 +/- ##
============================================
+ Coverage 72.40% 72.46% +0.06%
- Complexity 65554 65594 +40
============================================
Files 5292 5292
Lines 304493 304548 +55
Branches 44218 44231 +13
============================================
+ Hits 220463 220696 +233
+ Misses 65975 65789 -186
- Partials 18055 18063 +8
❌ Gradle check result for ed6fb58: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
… index per node. Added relevant files for unit test and integration test. Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>
ed6fb58 to e76ebb6
❌ Gradle check result for e76ebb6: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
… to RoutingNode.java Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>
37316b4 to 9a8643b
Description
For remote-store-backed clusters, segment replication is used as the replication strategy. With segment replication, segments are created only on the primary shard and are then copied to the replica shards. Because segment creation is CPU intensive, we have observed CPU skew between nodes of the same cluster when primary shards are not balanced.
Earlier attempts to rebalance primary shards across nodes (#6422, #12250) help reduce the skew, but they work on a best-effort basis and do not enforce any constraint.
Implement a new setting in OpenSearch:
index.routing.allocation.total_primary_shards_per_node
: An index-level setting that limits the number of primary shards per node for a specific index. The limit (indexTotalPrimaryShardsPerNodeLimit) is stored in index metadata, similar to indexTotalShardsPerNodeLimit. This setting gives more control over primary shard distribution, improving cluster balance and performance management.
The existing ShardsLimitAllocationDecider class already contains the infrastructure and logic needed to evaluate shard allocation constraints: it has access to the current cluster state, routing information, and methods to check shard counts per node. We therefore propose implementing the new primary shard limit check within this class. This reuses the current decision-making framework, keeps the new check consistent with existing allocation rules, and minimizes code duplication.
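The check described above can be sketched in isolation. The following is a minimal, self-contained illustration, not the code in this PR: `PrimaryShardLimitSketch`, `ShardCopy`, and `canAllocatePrimary` are hypothetical names, and treating a non-positive limit as "no limit" is an assumption modeled on how the existing total_shards_per_node setting behaves.

```java
import java.util.List;

/** Hypothetical, simplified stand-in for the proposed check; not the OpenSearch implementation. */
final class PrimaryShardLimitSketch {

    /** Minimal model of a shard copy assigned to a node (illustrative only). */
    record ShardCopy(String index, int shardId, boolean primary) {}

    /**
     * Returns true if one more primary shard of {@code index} may be placed on a node
     * that currently holds {@code shardsOnNode}.
     * Assumption: a limit <= 0 means the limit is disabled.
     */
    static boolean canAllocatePrimary(List<ShardCopy> shardsOnNode, String index, int primaryLimit) {
        if (primaryLimit <= 0) {
            return true; // limit disabled, nothing to enforce
        }
        // Count only primaries that belong to the index being allocated.
        long primariesOfIndex = shardsOnNode.stream()
            .filter(ShardCopy::primary)
            .filter(s -> s.index().equals(index))
            .count();
        // Reject once the node already holds the configured number of primaries.
        return primariesOfIndex < primaryLimit;
    }

    public static void main(String[] args) {
        List<ShardCopy> node = List.of(
            new ShardCopy("logs", 0, true),
            new ShardCopy("logs", 1, false),
            new ShardCopy("metrics", 0, true)
        );
        System.out.println(canAllocatePrimary(node, "logs", 1));
        System.out.println(canAllocatePrimary(node, "logs", 2));
    }
}
```

Replica copies and primaries of other indices do not count toward the limit in this sketch, which reflects the per-index, primary-only scope of the setting described above.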
Related Issues
Resolves #17293
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following the Developer Certificate of Origin and signing off your commits, please check here.