Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix remote shards balancer and remove unused variables #11167

Merged

Conversation

bugmakerrrrrr
Copy link
Contributor

@bugmakerrrrrr bugmakerrrrrr commented Nov 12, 2023

Description

Two changes here:

  • make sure the primary shard's AllocationStatus be set to DECIDERS_NO when it cannot be allocated to any nodes;
  • remove unused variables

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

❌ Gradle check result for 5ce38e5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Nov 12, 2023

Compatibility status:

Checks if related components are compatible with change 351649a

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/performance-analyzer.git]

@bugmakerrrrrr
Copy link
Contributor Author

@kotwanikunal can you take a look at this when you are available

@bugmakerrrrrr
Copy link
Contributor Author

bugmakerrrrrr commented Nov 22, 2023

Thanks @bugmakerrrrrr!

make sure the primary shard's AllocationStatus be set to DECIDERS_NO when it cannot be allocated to any nodes

This sounds like a bug. Can you provide details about when this can occur and how this change fixes it?

Yes, this bug occurs when all deciders return NO for each node, because we always set the throttled flag after iterating all nodes.

I just delete this statement to fix it.

@kotwanikunal
Copy link
Member

@kotwanikunal can you take a look at this when you are available

I just got back today. I will take a look at it shortly

@kotwanikunal
Copy link
Member

@bugmakerrrrrr LGTM. Do you mind adding in a changelog entry? I can re-run the gradle check as it looks unrelated.

#11179

@sohami would be great to have a review from you as well!

Copy link
Contributor

✅ Gradle check result for 2f962ba: SUCCESS

Copy link

codecov bot commented Nov 29, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (50e3666) 71.37% compared to head (351649a) 71.34%.
Report is 2 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #11167      +/-   ##
============================================
- Coverage     71.37%   71.34%   -0.04%     
+ Complexity    59102    59067      -35     
============================================
  Files          4893     4893              
  Lines        277754   277751       -3     
  Branches      40356    40356              
============================================
- Hits         198242   198148      -94     
- Misses        63066    63125      +59     
- Partials      16446    16478      +32     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bugmakerrrrrr
Copy link
Contributor Author

@kotwanikunal done

Copy link
Contributor

❌ Gradle check result for d17f525: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Collaborator

@sohami sohami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm. Without the fix it will leave the unassigned shards with reason as throttled whereas in reality it was not eligible to be assigned on any of the available remote capable nodes

Copy link
Contributor

❕ Gradle check result for d17f525: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.remotestore.RemoteIndexPrimaryRelocationIT.testPrimaryRelocationWhileIndexing

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@kotwanikunal
Copy link
Member

@bugmakerrrrrr Can you please rebase? :)

Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
@bugmakerrrrrr
Copy link
Contributor Author

@bugmakerrrrrr Can you please rebase? :)

done

Copy link
Contributor

❕ Gradle check result for 351649a: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.search.SearchWeightedRoutingIT.testShardRoutingWithNetworkDisruption_FailOpenEnabled

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@kotwanikunal kotwanikunal added the backport 2.x Backport to 2.x branch label Nov 30, 2023
@kotwanikunal kotwanikunal merged commit edf7861 into opensearch-project:main Nov 30, 2023
32 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 30, 2023
* Fix RemoteShardsBalancer

Signed-off-by: panguixin <panguixin@bytedance.com>

* remove unused variables

Signed-off-by: panguixin <panguixin@bytedance.com>

* run spotless

Signed-off-by: panguixin <panguixin@bytedance.com>

* add change log

Signed-off-by: panguixin <panguixin@bytedance.com>

---------

Signed-off-by: panguixin <panguixin@bytedance.com>
(cherry picked from commit edf7861)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
kotwanikunal pushed a commit that referenced this pull request Nov 30, 2023
* Fix RemoteShardsBalancer



* remove unused variables



* run spotless



* add change log



---------


(cherry picked from commit edf7861)

Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
fahadshamiinsta pushed a commit to fahadshamiinsta/OpenSearch270 that referenced this pull request Dec 4, 2023
…oject#11167)

* Fix RemoteShardsBalancer

Signed-off-by: panguixin <panguixin@bytedance.com>

* remove unused variables

Signed-off-by: panguixin <panguixin@bytedance.com>

* run spotless

Signed-off-by: panguixin <panguixin@bytedance.com>

* add change log

Signed-off-by: panguixin <panguixin@bytedance.com>

---------

Signed-off-by: panguixin <panguixin@bytedance.com>
deshsidd pushed a commit to deshsidd/OpenSearch that referenced this pull request Dec 11, 2023
…oject#11167)

* Fix RemoteShardsBalancer

Signed-off-by: panguixin <panguixin@bytedance.com>

* remove unused variables

Signed-off-by: panguixin <panguixin@bytedance.com>

* run spotless

Signed-off-by: panguixin <panguixin@bytedance.com>

* add change log

Signed-off-by: panguixin <panguixin@bytedance.com>

---------

Signed-off-by: panguixin <panguixin@bytedance.com>
@bugmakerrrrrr bugmakerrrrrr deleted the fix_remote_shards_balancer branch January 3, 2024 10:24
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Mar 18, 2024
…oject#11167)

* Fix RemoteShardsBalancer

Signed-off-by: panguixin <panguixin@bytedance.com>

* remove unused variables

Signed-off-by: panguixin <panguixin@bytedance.com>

* run spotless

Signed-off-by: panguixin <panguixin@bytedance.com>

* add change log

Signed-off-by: panguixin <panguixin@bytedance.com>

---------

Signed-off-by: panguixin <panguixin@bytedance.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…oject#11167)

* Fix RemoteShardsBalancer

Signed-off-by: panguixin <panguixin@bytedance.com>

* remove unused variables

Signed-off-by: panguixin <panguixin@bytedance.com>

* run spotless

Signed-off-by: panguixin <panguixin@bytedance.com>

* add change log

Signed-off-by: panguixin <panguixin@bytedance.com>

---------

Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants