[BUG] RemoteStoreIT.testRemoteTranslogRestore flaky test failure #6188

Closed
andrross opened this issue Feb 4, 2023 · 11 comments
Labels: bug, Storage:Durability, v2.7.0

Comments

@andrross
Member

andrross commented Feb 4, 2023

The following test failed with a timeout, but the failure is not reproducible:

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.remotestore.RemoteStoreIT.testRemoteTranslogRestore" -Dtests.seed=DF38037B9405A5AE -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=de-CH -Dtests.timezone=America/North_Dakota/Beulah -Druntime.java=17

org.opensearch.remotestore.RemoteStoreIT > testRemoteTranslogRestore FAILED
    java.lang.AssertionError: timed out waiting for yellow state
        at __randomizedtesting.SeedInfo.seed([DF38037B9405A5AE:C4871B4D04A1356B]:0)
        at org.junit.Assert.fail(Assert.java:89)
        at org.opensearch.test.OpenSearchIntegTestCase.ensureColor(OpenSearchIntegTestCase.java:1007)
        at org.opensearch.test.OpenSearchIntegTestCase.ensureYellowAndNoInitializingShards(OpenSearchIntegTestCase.java:960)
        at org.opensearch.remotestore.RemoteStoreIT.verifyRestoredData(RemoteStoreIT.java:143)
        at org.opensearch.remotestore.RemoteStoreIT.testRemoteTranslogRestore(RemoteStoreIT.java:181)

(#6182) https://build.ci.opensearch.org/job/gradle-check/10624/consoleFull

/cc @sachinpkale

andrross added the bug and untriaged labels Feb 4, 2023
sachinpkale added the Storage:Durability and v2.6.0 labels and removed the untriaged label Feb 4, 2023
sachinpkale self-assigned this Feb 4, 2023
@sachinpkale
Member

Thanks @andrross, I will take a look and fix it.

@sachinpkale
Member

sachinpkale commented Feb 4, 2023

The changes in #6086 should fix this test's flakiness; I will check further.

@sachinpkale
Member

The backport of #6086 is still in flight. Once the changes are backported, this test should not fail.

@ashking94
Member

The build is in progress for the backport PR, #6170. We can merge it once the build succeeds.

@ashking94
Member

The above PR has been merged. Closing this issue.

reta reopened this Feb 8, 2023
@reta
Collaborator

reta commented Feb 8, 2023

Still happening on 2.x:

Failed
org.opensearch.remotestore.RemoteStoreIT.testRemoteTranslogRestore

java.lang.AssertionError: timed out waiting for yellow state
	at __randomizedtesting.SeedInfo.seed([7E138B60D5A3FE42:65AC935645076E87]:0)
	at org.junit.Assert.fail(Assert.java:89)
	at org.opensearch.test.OpenSearchIntegTestCase.ensureColor(OpenSearchIntegTestCase.java:1007)
	at org.opensearch.test.OpenSearchIntegTestCase.ensureYellowAndNoInitializingShards(OpenSearchIntegTestCase.java:960)
	at org.opensearch.remotestore.RemoteStoreIT.verifyRestoredData(RemoteStoreIT.java:143)

[1] https://build.ci.opensearch.org/job/gradle-check/10805/

@sachinpkale
Member

Let me mute the test until we fix it completely.

@sachinpkale
Member

Muted the test as part of #6230.

@sachinpkale
Member

The test is currently muted; the fix is planned for 2.7.0.

sachinpkale added the v2.7.0 label and removed the v2.6.0 label Feb 21, 2023
@DarshitChanpura
Member

DarshitChanpura commented Apr 14, 2023

Hey @andrross @sachinpkale. Is the bug fix still on track to be completed by the v2.7.0 code freeze (Apr 17)?

@DarshitChanpura
Member

Closing, as this looks to have been fixed by two PRs: #6252 and then #6375. Both have been backported to the 2.x line (#6265 and #6378).
