[BUG] Fix remote shards balancer when filtering throttled nodes #11724
Compatibility status: Checks if related components are compatible with change c8d000d
Incompatible components: [https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/cross-cluster-replication.git]
Skipped components:
Compatible components: [https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/sql.git]
❕ Gradle check result for 4c5408b: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
@kotwanikunal could you take a look at this additional one?
.../src/main/java/org/opensearch/cluster/routing/allocation/allocator/RemoteShardsBalancer.java
Signed-off-by: panguixin <panguixin@bytedance.com>
7ab39d9 to ec30c58
@kotwanikunal Can we merge this?
@bugmakerrrrrr Can we add a unit test for the bug being fixed here?
@andrross I think that |
❕ Gradle check result for c7ef861: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
Signed-off-by: Andrew Ross <andrross@amazon.com>
❌ Gradle check result for c8d000d: Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❕ Gradle check result for c8d000d: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
* fix remote shards balancer Signed-off-by: panguixin <panguixin@bytedance.com> * add change log Signed-off-by: panguixin <panguixin@bytedance.com> --------- Signed-off-by: panguixin <panguixin@bytedance.com> Signed-off-by: Andrew Ross <andrross@amazon.com> Co-authored-by: Andrew Ross <andrross@amazon.com> (cherry picked from commit 9f649e0) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…) (#12024) * fix remote shards balancer * add change log --------- (cherry picked from commit 9f649e0) Signed-off-by: panguixin <panguixin@bytedance.com> Signed-off-by: Andrew Ross <andrross@amazon.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Andrew Ross <andrross@amazon.com>
…search-project#11724) * fix remote shards balancer Signed-off-by: panguixin <panguixin@bytedance.com> * add change log Signed-off-by: panguixin <panguixin@bytedance.com> --------- Signed-off-by: panguixin <panguixin@bytedance.com> Signed-off-by: Andrew Ross <andrross@amazon.com> Co-authored-by: Andrew Ross <andrross@amazon.com>
…search-project#11724) * fix remote shards balancer Signed-off-by: panguixin <panguixin@bytedance.com> * add change log Signed-off-by: panguixin <panguixin@bytedance.com> --------- Signed-off-by: panguixin <panguixin@bytedance.com> Signed-off-by: Andrew Ross <andrross@amazon.com> Co-authored-by: Andrew Ross <andrross@amazon.com> Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Description
Today, `RemoteShardsBalancer` uses `AllocationDecider#canAllocateAnyShardToNode` to filter out throttled or ineligible nodes when allocating unassigned shards. If every eligible node for an unassigned shard is filtered out before allocation of that shard is attempted, the shard is marked as ignored with the `UnassignedInfo.AllocationStatus.DECIDERS_NO` status. As a result, the corresponding `ShardRestoreStatus` is set to `Failure` (see `RestoreService.RestoreInProgressUpdater#unassignedInfoUpdated`). This pull request takes throttled nodes into account and ensures that shards are marked with the appropriate status.
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
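To illustrate the idea in the Description, here is a minimal, self-contained sketch. It does not use the real OpenSearch classes and is not the code changed in this pull request: `ThrottleAwareStatusSketch`, `NodeDecision`, and `statusForIgnoredShard` are hypothetical names, and the sketch assumes that a shard left without a target node because of throttling should be reported as throttled rather than as `DECIDERS_NO`.

```java
import java.util.List;

// Hypothetical, simplified model; not the actual RemoteShardsBalancer code.
public class ThrottleAwareStatusSketch {

    // Simplified per-node decider outcome (assumption for illustration).
    public enum NodeDecision { YES, THROTTLE, NO }

    // Simplified stand-in for the status recorded on an ignored shard.
    public enum AllocationStatus { DECIDERS_NO, DECIDERS_THROTTLED }

    /**
     * Picks the status for a shard that could not be allocated in this round.
     * If at least one node was only throttled, the shard may still be
     * allocated later, so it is reported as throttled instead of rejected.
     */
    public static AllocationStatus statusForIgnoredShard(List<NodeDecision> decisions) {
        boolean anyThrottled = decisions.contains(NodeDecision.THROTTLE);
        return anyThrottled ? AllocationStatus.DECIDERS_THROTTLED : AllocationStatus.DECIDERS_NO;
    }

    public static void main(String[] args) {
        // One node throttled, one node rejected: the shard should be retried later.
        System.out.println(statusForIgnoredShard(List.of(NodeDecision.THROTTLE, NodeDecision.NO)));
        // Every node rejected: no node can ever take the shard under current rules.
        System.out.println(statusForIgnoredShard(List.of(NodeDecision.NO, NodeDecision.NO)));
    }
}
```

In this model, only the all-rejected case produces `DECIDERS_NO`, which is the status the Description ties to a restore being marked as `Failure`.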
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.