Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AUTOCUT] Gradle Check Flaky Test Report for AwarenessAttributeDecommissionIT #14290

Open
opensearch-ci-bot opened this issue Jun 13, 2024 · 0 comments · Fixed by #14372
Open
Labels
autocut Cluster Manager flaky-test Random test failure that succeeds on second run >test-failure Test failure from CI, local build, etc.

Comments

@opensearch-ci-bot
Copy link
Collaborator

opensearch-ci-bot commented Jun 13, 2024

Flaky Test Report for AwarenessAttributeDecommissionIT

Noticed the AwarenessAttributeDecommissionIT has some flaky, failing tests that failed during post-merge actions.

Details

Git Reference Merged Pull Request Build Details Test Name
b01b6e8 13998 40004 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction
b03066c 14044 40069 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction
bb84ca7 13916 39562 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction
c71060e 13772 40579 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction
9d3cf43 13374 40122 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction
04a417a 13739 40555 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction
5466e90 14366 41057 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction
5b19454 13801 39904 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction
6ac4e4f 14047 40068 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction
7e4a341 14163 40646 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction
7faae25 13577 39980 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction

The other pull requests, besides those involved in post-merge actions, that contain failing tests with the AwarenessAttributeDecommissionIT class are:

For more details on the failed tests refer to OpenSearch Gradle Check Metrics dashboard.

@opensearch-ci-bot opensearch-ci-bot added >test-failure Test failure from CI, local build, etc. autocut untriaged labels Jun 13, 2024
@prudhvigodithi prudhvigodithi added the flaky-test Random test failure that succeeds on second run label Jun 14, 2024
andrross added a commit to andrross/OpenSearch that referenced this issue Jun 14, 2024
The problem is that this test would decommission one of six nodes. The
tear down logic of the test would attempt to assert on the health of the
cluster by randomly selecting a node and requesting the cluster health.
If this random check happened to select the node that was
decommissioned, then the test would fail. The fix is to recommission
the node at the end of the test.

Also, the "recommission node and assert cluster health" logic was used
in multiple places and could be refactored out to a helper method.

Resolves opensearch-project#14290
Resolves opensearch-project#12197

Signed-off-by: Andrew Ross <andrross@amazon.com>
andrross added a commit to andrross/OpenSearch that referenced this issue Jun 15, 2024
The problem is that this test would decommission one of six nodes. The
tear down logic of the test would attempt to assert on the health of the
cluster by randomly selecting a node and requesting the cluster health.
If this random check happened to select the node that was
decommissioned, then the test would fail. The fix is to recommission
the node at the end of the test.

Also, the "recommission node and assert cluster health" logic was used
in multiple places and could be refactored out to a helper method.

Resolves opensearch-project#14290
Resolves opensearch-project#12197

Signed-off-by: Andrew Ross <andrross@amazon.com>
@reta reta closed this as completed in 0d38d14 Jun 15, 2024
opensearch-trigger-bot bot pushed a commit that referenced this issue Jun 15, 2024
…#14372)

The problem is that this test would decommission one of six nodes. The
tear down logic of the test would attempt to assert on the health of the
cluster by randomly selecting a node and requesting the cluster health.
If this random check happened to select the node that was
decommissioned, then the test would fail. The fix is to recommission
the node at the end of the test.

Also, the "recommission node and assert cluster health" logic was used
in multiple places and could be refactored out to a helper method.

Resolves #14290
Resolves #12197

Signed-off-by: Andrew Ross <andrross@amazon.com>
(cherry picked from commit 0d38d14)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
reta pushed a commit that referenced this issue Jun 15, 2024
…#14372) (#14376)

The problem is that this test would decommission one of six nodes. The
tear down logic of the test would attempt to assert on the health of the
cluster by randomly selecting a node and requesting the cluster health.
If this random check happened to select the node that was
decommissioned, then the test would fail. The fix is to recommission
the node at the end of the test.

Also, the "recommission node and assert cluster health" logic was used
in multiple places and could be refactored out to a helper method.

Resolves #14290
Resolves #12197


(cherry picked from commit 0d38d14)

Signed-off-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
harshavamsi pushed a commit to harshavamsi/OpenSearch that referenced this issue Jul 12, 2024
…opensearch-project#14372)

The problem is that this test would decommission one of six nodes. The
tear down logic of the test would attempt to assert on the health of the
cluster by randomly selecting a node and requesting the cluster health.
If this random check happened to select the node that was
decommissioned, then the test would fail. The fix is to recommission
the node at the end of the test.

Also, the "recommission node and assert cluster health" logic was used
in multiple places and could be refactored out to a helper method.

Resolves opensearch-project#14290
Resolves opensearch-project#12197

Signed-off-by: Andrew Ross <andrross@amazon.com>
kkewwei pushed a commit to kkewwei/OpenSearch that referenced this issue Jul 24, 2024
…opensearch-project#14372) (opensearch-project#14376)

The problem is that this test would decommission one of six nodes. The
tear down logic of the test would attempt to assert on the health of the
cluster by randomly selecting a node and requesting the cluster health.
If this random check happened to select the node that was
decommissioned, then the test would fail. The fix is to recommission
the node at the end of the test.

Also, the "recommission node and assert cluster health" logic was used
in multiple places and could be refactored out to a helper method.

Resolves opensearch-project#14290
Resolves opensearch-project#12197

(cherry picked from commit 0d38d14)

Signed-off-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Signed-off-by: kkewwei <kkewwei@163.com>
wdongyu pushed a commit to wdongyu/OpenSearch that referenced this issue Aug 22, 2024
…opensearch-project#14372)

The problem is that this test would decommission one of six nodes. The
tear down logic of the test would attempt to assert on the health of the
cluster by randomly selecting a node and requesting the cluster health.
If this random check happened to select the node that was
decommissioned, then the test would fail. The fix is to recommission
the node at the end of the test.

Also, the "recommission node and assert cluster health" logic was used
in multiple places and could be refactored out to a helper method.

Resolves opensearch-project#14290
Resolves opensearch-project#12197

Signed-off-by: Andrew Ross <andrross@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
autocut Cluster Manager flaky-test Random test failure that succeeds on second run >test-failure Test failure from CI, local build, etc.
Projects
Status: 🆕 New
Development

Successfully merging a pull request may close this issue.

3 participants