Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail weight update when decommission ongoing and fail decommission when attribute not weighed away #4839

Merged
merged 8 commits into from
Oct 20, 2022

Conversation

imRishN
Copy link
Member

@imRishN imRishN commented Oct 19, 2022

Description

  1. Fail weight update call if decommission action is in process
  2. Fail Decommission call if the attribute is not weighed away

Issues Resolved

#4838

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@imRishN imRishN requested review from a team and reta as code owners October 19, 2022 15:23
@imRishN imRishN changed the title Wrr validation Fail weight update when decommission ongoing and fail decommission when attribute not weighed away Oct 19, 2022
Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

WeightedRouting weightedRouting = weightedRoutingMetadata.getWeightedRouting();
if (weightedRouting.attributeName().equals(decommissionAttribute.attributeName())) {
Double attributeValueWeight = weightedRouting.weights().get(decommissionAttribute.attributeValue());
if (attributeValueWeight == null || attributeValueWeight != 0.0) {
Copy link
Collaborator

@Bukhtawar Bukhtawar Oct 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this value float or double, should we make specific comparisons. For instance Double.valueOf(0.0)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is Double. using .equals()

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Fix spotless check

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@codecov-commenter
Copy link

Codecov Report

Merging #4839 (042fa3a) into main (eb33015) will decrease coverage by 0.02%.
The diff coverage is 66.66%.

@@             Coverage Diff              @@
##               main    #4839      +/-   ##
============================================
- Coverage     70.73%   70.70%   -0.03%     
- Complexity    57884    57894      +10     
============================================
  Files          4689     4689              
  Lines        277296   277314      +18     
  Branches      40363    40367       +4     
============================================
- Hits         196137   196075      -62     
- Misses        64873    64968      +95     
+ Partials      16286    16271      -15     
Impacted Files Coverage Δ
...arch/cluster/decommission/DecommissionService.java 32.03% <50.00%> (+2.35%) ⬆️
...search/cluster/routing/WeightedRoutingService.java 87.93% <100.00%> (+7.16%) ⬆️
...n/admin/cluster/node/tasks/get/GetTaskRequest.java 30.30% <0.00%> (-63.64%) ⬇️
.../action/admin/indices/flush/ShardFlushRequest.java 50.00% <0.00%> (-50.00%) ⬇️
...pensearch/action/ingest/DeletePipelineRequest.java 31.25% <0.00%> (-37.50%) ⬇️
.../admin/cluster/node/tasks/get/GetTaskResponse.java 28.57% <0.00%> (-35.72%) ⬇️
...earch/action/admin/indices/flush/FlushRequest.java 52.17% <0.00%> (-34.79%) ⬇️
...ensearch/client/indices/DetailAnalyzeResponse.java 20.54% <0.00%> (-34.25%) ⬇️
...search/search/aggregations/pipeline/EwmaModel.java 24.44% <0.00%> (-26.67%) ⬇️
.../aggregations/support/values/ScriptLongValues.java 46.51% <0.00%> (-23.26%) ⬇️
... and 477 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.


public void ensureNoOngoingDecommissionAction(ClusterState state) {
DecommissionAttributeMetadata decommissionAttributeMetadata = state.metadata().decommissionAttributeMetadata();
if (decommissionAttributeMetadata != null && decommissionAttributeMetadata.status().equals(DecommissionStatus.FAILED) == false) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if status is SUCCESSFUL?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only when Decommission has failed, we can allow the weight update can go through. Not for any other status

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't SUCCESSFUL a final state? The error message of this if condition says: a decommission action is ongoing with status...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A succesful decommission state means the transport execution has completed. The nodes are decommissioned in this state (not able to start pre voting or join). Hence DecommissionAction ongoing means a zone is decommissioned and the cluster rejects pings from decommissioned node

@sachinpkale
Copy link
Member

Build is failing, please fix the build.


public void ensureNoOngoingDecommissionAction(ClusterState state) {
DecommissionAttributeMetadata decommissionAttributeMetadata = state.metadata().decommissionAttributeMetadata();
if (decommissionAttributeMetadata != null && decommissionAttributeMetadata.status().equals(DecommissionStatus.FAILED) == false) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If decommission status is FAILED can certain nodes be out of service or decommissioned while others not as in partially failed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, if the decommission is FAILED, joins or prevoting is not blocked on those nodes and hence if some nodes are out of service (which can happen), the weights request can go through

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then the zone isn't fully equipped to take traffic right like 90% nodes are out of service and we place weights 10:1:1 when 10 is for the zone only operating at 10% capacity?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No

Copy link
Collaborator

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a comment, LGTM otherwise

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@Bukhtawar Bukhtawar merged commit 3344738 into opensearch-project:main Oct 20, 2022
imRishN added a commit to imRishN/OpenSearch that referenced this pull request Nov 3, 2022
…en attribute not weighed away (opensearch-project#4839)

* Add checks for decommission before setting weights

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Bukhtawar pushed a commit that referenced this pull request Nov 3, 2022
* Fail weight update when decommission ongoing and fail decommission when attribute not weighed away (#4839)

* Add changes for graceful node decommission (#4586)

* Add delay timeout for decommission request (#4931)

* Integ Tests for Awareness Attribute Decommissioning (#4715)

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Signed-off-by: pranikum <109206473+pranikum@users.noreply.github.com>
imRishN added a commit to imRishN/OpenSearch that referenced this pull request Nov 3, 2022
* Fail weight update when decommission ongoing and fail decommission when attribute not weighed away (opensearch-project#4839)

* Add changes for graceful node decommission (opensearch-project#4586)

* Add delay timeout for decommission request (opensearch-project#4931)

* Integ Tests for Awareness Attribute Decommissioning (opensearch-project#4715)

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Signed-off-by: pranikum <109206473+pranikum@users.noreply.github.com>
ashking94 pushed a commit to ashking94/OpenSearch that referenced this pull request Nov 7, 2022
…en attribute not weighed away (opensearch-project#4839)

* Add checks for decommission before setting weights

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants