Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gracefully handle concurrent zone decommission action #5542

Merged

Conversation

imRishN
Copy link
Member

@imRishN imRishN commented Dec 13, 2022

Description

This PR introduces changes related to Decommission Service -

  • Add request id to decommission request object
  • Stores the request id to state metadata
  • Uses this metadata when a request comes to identify if its a retried request

Issues Resolved

#4543

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@imRishN imRishN changed the title Decommission/handle concurrency Gracefully handle concurrent zone decommission action Dec 13, 2022
Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@codecov-commenter
Copy link

codecov-commenter commented Dec 13, 2022

Codecov Report

Merging #5542 (4801501) into main (28e9b11) will increase coverage by 0.06%.
The diff coverage is 70.58%.

@@             Coverage Diff              @@
##               main    #5542      +/-   ##
============================================
+ Coverage     70.98%   71.05%   +0.06%     
- Complexity    58669    58685      +16     
============================================
  Files          4763     4764       +1     
  Lines        279945   280022      +77     
  Branches      40418    40425       +7     
============================================
+ Hits         198731   198958     +227     
+ Misses        64988    64750     -238     
- Partials      16226    16314      +88     
Impacted Files Coverage Δ
...sion/awareness/put/DecommissionRequestBuilder.java 0.00% <0.00%> (ø)
...a/org/opensearch/test/OpenSearchIntegTestCase.java 56.35% <0.00%> (-0.32%) ⬇️
...pensearch/common/settings/FeatureFlagSettings.java 50.00% <50.00%> (ø)
...ecommission/awareness/put/DecommissionRequest.java 81.66% <71.42%> (-4.05%) ⬇️
...arch/cluster/decommission/DecommissionService.java 37.85% <75.00%> (+1.96%) ⬆️
...t/action/admin/cluster/RestDecommissionAction.java 80.95% <75.00%> (-4.77%) ⬇️
...h/cluster/decommission/DecommissionController.java 62.83% <86.36%> (+5.68%) ⬆️
.../java/org/opensearch/common/util/FeatureFlags.java 80.00% <88.88%> (+30.00%) ⬆️
...org/opensearch/common/settings/SettingsModule.java 85.81% <100.00%> (+0.29%) ⬆️
server/src/main/java/org/opensearch/node/Node.java 87.17% <100.00%> (+0.01%) ⬆️
... and 495 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.snapshots.DedicatedClusterSnapshotRestoreIT.testIndexDeletionDuringSnapshotCreationInQueue

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@dreamer-89
Copy link
Member

dreamer-89 commented Dec 14, 2022

Gradle Check (Jenkins) Run Completed with:

Failing due to version bump after 2.4.1 release, #5560. Fixed is merged into main branch. @imRishN : Can you please rebase your branch against latest main branch ?

* What went wrong:
Execution failed for task ':distribution:bwc:bugfix:buildBwcLinuxTar'.
> Building 2.4.1 didn't generate expected file /var/jenkins/workspace/gradle-check/search/distribution/bwc/bugfix/build/bwc/checkout-2.4/distribution/archives/linux-tar/build/distributions/opensearch-min-2.4.1-SNAPSHOT-linux-x64.tar.gz

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@@ -144,12 +145,19 @@ public void startDecommissionAction(
public ClusterState execute(ClusterState currentState) {
// validates if correct awareness attributes and forced awareness attribute set to the cluster before starting action
validateAwarenessAttribute(decommissionAttribute, awarenessAttributes, forcedAwarenessAttributes);
if (decommissionRequest.requestID() == null) {
decommissionRequest.setRequestID(UUIDs.randomBase64UUID());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of setting here. Just generate it in RestAction so that you don't need it set it again. Then your request would also be immutable. Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, only the REST layer would be having UUIDs set. We would want the IDs to be there not conditionally. When client pkgs tries to interact with request object directly, won't it be a problem? Also Integ tests are not Rest layer tests. And OpenSearch uses http smoke test pkg for rest integ tests. The current IT pkg doesn't support REST tests. Let me know if this makes sense..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imRishN I was wondering if you set the requestId here itself -

DecommissionRequest createRequest(RestRequest request) throws IOException {
then it can be passed around as is.

if (decommissionRequest.requestID().equals(decommissionAttributeMetadata.requestID()) == false) {
throw new DecommissioningFailedException(
requestedDecommissionAttribute,
"same request is already in status [INIT]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor]: another request instead of same request

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a same request from users perspective as it would be decommissioning same attribute name and attribute value. Although, technically its a different request but these messages are generally user friendly

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Copy link
Member

@shwetathareja shwetathareja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @imRishN for the changes.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@Bukhtawar Bukhtawar merged commit ac3351c into opensearch-project:main Jan 10, 2023
imRishN added a commit to imRishN/OpenSearch that referenced this pull request Jan 10, 2023
…ject#5542)

* Control concurrency and handle retries during decommissioning of awareness attributes

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
imRishN added a commit to imRishN/OpenSearch that referenced this pull request Jan 10, 2023
…ject#5542)

* Control concurrency and handle retries during decommissioning of awareness attributes

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Bukhtawar pushed a commit that referenced this pull request Jan 10, 2023
* Control concurrency and handle retries during decommissioning of awareness attributes

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Bukhtawar pushed a commit that referenced this pull request Jan 10, 2023
* Control concurrency and handle retries during decommissioning of awareness attributes

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
kotwanikunal pushed a commit that referenced this pull request Jan 25, 2023
* Control concurrency and handle retries during decommissioning of awareness attributes

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
v2.5.0 'Issues and PRs related to version v2.5.0'
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants