[BUG] org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.classMethod is flaky #11257

Closed
kasundra07 opened this issue Nov 17, 2023 · 2 comments
Labels
bug, untriaged

Comments

@kasundra07
Contributor

Describe the bug
org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.classMethod is flaky

java.lang.RuntimeException: file handle leaks: [FileChannel(/var/jenkins/workspace/gradle-check/search/server/build/testrun/internalClusterTest/temp/org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT_138297ACFE85FB99-001/tempDir-005/node_t4/nodes/0/indices/RomeNZoISAupBjrumlxDNQ/0/translog/translog-8.tlog), FileChannel(/var/jenkins/workspace/gradle-check/search/server/build/testrun/internalClusterTest/temp/org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT_138297ACFE85FB99-001/tempDir-005/node_t4/nodes/0/indices/RomeNZoISAupBjrumlxDNQ/0/translog/translog-8.ckp), FileChannel(/var/jenkins/workspace/gradle-check/search/server/build/testrun/internalClusterTest/temp/org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT_138297ACFE85FB99-001/tempDir-005/node_t4/nodes/0/indices/RomeNZoISAupBjrumlxDNQ/0/translog/translog-9.ckp), FileChannel(/var/jenkins/workspace/gradle-check/search/server/build/testrun/internalClusterTest/temp/org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT_138297ACFE85FB99-001/tempDir-005/node_t4/nodes/0/indices/RomeNZoISAupBjrumlxDNQ/0/translog/translog-10.tlog), FileChannel(/var/jenkins/workspace/gradle-check/search/server/build/testrun/internalClusterTest/temp/org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT_138297ACFE85FB99-001/tempDir-005/node_t4/nodes/0/indices/RomeNZoISAupBjrumlxDNQ/0/translog/translog-9.tlog), FileChannel(/var/jenkins/workspace/gradle-check/search/server/build/testrun/internalClusterTest/temp/org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT_138297ACFE85FB99-001/tempDir-005/node_t4/nodes/0/indices/RomeNZoISAupBjrumlxDNQ/0/translog/translog-10.ckp)]
	at __randomizedtesting.SeedInfo.seed([138297ACFE85FB99]:0)
	at org.apache.lucene.tests.mockfile.LeakFS.onClose(LeakFS.java:63)
	at org.apache.lucene.tests.mockfile.FilterFileSystem.close(FilterFileSystem.java:69)
	at org.apache.lucene.tests.mockfile.FilterFileSystem.close(FilterFileSystem.java:70)
	at org.apache.lucene.tests.util.TestRuleTemporaryFilesCleanup.afterAlways(TestRuleTemporaryFilesCleanup.java:223)
	at com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterAlways(TestRuleAdapter.java:31)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:43)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
	at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
	at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
	at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.Exception
	at org.apache.lucene.tests.mockfile.LeakFS.onOpen(LeakFS.java:46)
	at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82)
	at org.apache.lucene.tests.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:202)
	at org.apache.lucene.tests.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:171)
	at java.base/java.nio.channels.FileChannel.open(FileChannel.java:309)
	at java.base/java.nio.channels.FileChannel.open(FileChannel.java:369)
	at org.opensearch.index.translog.transfer.FileSnapshot.<init>(FileSnapshot.java:46)
	at org.opensearch.index.translog.transfer.FileSnapshot$TransferFileSnapshot.<init>(FileSnapshot.java:113)
	at org.opensearch.index.translog.transfer.FileSnapshot$TranslogFileSnapshot.<init>(FileSnapshot.java:158)
	at org.opensearch.index.translog.transfer.TranslogCheckpointTransferSnapshot$Builder.build(TranslogCheckpointTransferSnapshot.java:161)
	at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:338)
	at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:310)
	at org.opensearch.index.translog.RemoteFsTranslog.sync(RemoteFsTranslog.java:365)
	at org.opensearch.index.translog.InternalTranslogManager.syncTranslog(InternalTranslogManager.java:196)
	at org.opensearch.index.engine.InternalEngine.syncTranslog(InternalEngine.java:610)
	at org.opensearch.index.shard.IndexShard.postActivatePrimaryMode(IndexShard.java:3449)
	at org.opensearch.index.shard.IndexShard.lambda$updateShardState$4(IndexShard.java:727)
	at org.opensearch.index.shard.IndexShard$5.onResponse(IndexShard.java:4052)
	at org.opensearch.index.shard.IndexShard$5.onResponse(IndexShard.java:4022)
	at org.opensearch.index.shard.IndexShard.lambda$asyncBlockOperations$37(IndexShard.java:3973)
	at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82)
	at org.opensearch.index.shard.IndexShardOperationPermits$1.doRun(IndexShardOperationPermits.java:157)
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:908)
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	... 1 more

To Reproduce

./gradlew ':server:internalClusterTest' --tests "org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testReplicaAlreadyAtCheckpoint" -Dtests.seed=138297ACFE85FB99

Expected behavior
The test should pass on every run.

Additional context
https://build.ci.opensearch.org/job/gradle-check/30101/testReport/junit/org.opensearch.remotestore/SegmentReplicationUsingRemoteStoreIT/classMethod/

kasundra07 added the bug and untriaged labels on Nov 17, 2023
@mch2
Member

mch2 commented Nov 17, 2023

Looks the same as #11255 (comment)

@andrross
Member

Thanks @kasundra07. The "classMethod" case is reported as a failure because of a leaked file handle, but the handle leaked because a test case within this class failed and therefore did not clean up properly. In this case the failed test case is the one detailed in #11255.

Closing as a duplicate of #11255.
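
For readers unfamiliar with the mechanism described above, here is a minimal Java sketch of how a failure inside one test case can surface later as a suite-level "file handle leaks" error. Everything in it is an assumption made for illustration: `LeakSketch`, `Tracked`, `leakyCase`, `safeCase`, and `suiteLevelLeakCheck` are hypothetical names, and the tracking set is only a simplified stand-in for a leak-tracking test file system, not Lucene's `LeakFS` or any OpenSearch code.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for a leak-tracking test file system.
// It is NOT Lucene's LeakFS; it only mimics the idea of tracking open handles.
class LeakSketch {
    // Handles that have been opened and not yet closed.
    static final Set<Tracked> OPEN = ConcurrentHashMap.newKeySet();

    static final class Tracked implements Closeable {
        final Path path;
        final FileChannel channel;

        Tracked(Path path) throws IOException {
            this.path = path;
            this.channel = FileChannel.open(path, StandardOpenOption.READ);
            OPEN.add(this); // register on open
        }

        @Override
        public void close() throws IOException {
            channel.close();
            OPEN.remove(this); // unregister on close
        }

        @Override
        public String toString() {
            return "FileChannel(" + path + ")";
        }
    }

    static void simulateFailure() {
        throw new IllegalStateException("simulated test failure");
    }

    // The failure skips the explicit close, so the handle stays registered.
    static void leakyCase(Path p) throws IOException {
        Tracked t = new Tracked(p);
        simulateFailure();
        t.close(); // never reached
    }

    // try-with-resources closes the handle even though the body throws.
    static void safeCase(Path p) throws IOException {
        try (Tracked t = new Tracked(p)) {
            simulateFailure();
        }
    }

    // Suite-level check, run once after all cases have finished.
    static void suiteLevelLeakCheck() {
        if (!OPEN.isEmpty()) {
            throw new RuntimeException("file handle leaks: " + OPEN);
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("translog-sketch", ".tlog");
        try { safeCase(p); } catch (IllegalStateException expected) { }
        System.out.println("open after safeCase: " + OPEN.size());
        try { leakyCase(p); } catch (IllegalStateException expected) { }
        System.out.println("open after leakyCase: " + OPEN.size());
        try {
            suiteLevelLeakCheck();
        } catch (RuntimeException e) {
            System.out.println("suite-level failure: " + e.getMessage());
        } finally {
            // Release the leaked handles so the temp file can be deleted.
            for (Tracked t : OPEN) t.close();
            Files.deleteIfExists(p);
        }
    }
}
```

In this sketch the suite-level check fails even though the real problem is the exception thrown inside `leakyCase`, which is why a "classMethod" failure like this one is usually a symptom of another failing test case in the same class rather than a separate bug.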
