
ReplicationTrackerRetentionLeaseTests.testReplicaIgnoresOlderRetentionLeasesVersion fails occasionally #38245

Closed
gwbrown opened this issue Feb 2, 2019 · 5 comments
Labels: :Distributed/CCR Issues around the Cross Cluster State Replication features, >test-failure Triaged test failures from CI

gwbrown commented Feb 2, 2019

I ran into this locally, and it reproduces reliably for me on the latest master at the time of writing. The test appears to pass most of the time, but the reproduction line below fails consistently. There aren't any failures of this test in the build-stats tracker, so I'm not going to mute it (yet).

./gradlew :server:unitTest -Dtests.seed=ACF05F449C755A1 -Dtests.class=org.elasticsearch.index.seqno.ReplicationTrackerRetentionLeaseTests -Dtests.method="testReplicaIgnoresOlderRetentionLeasesVersion" -Dtests.security.manager=true -Dtests.locale=en-GI -Dtests.timezone=America/St_Thomas -Dcompiler.java=11 -Druntime.java=11

The failure is:

Expected: an empty collection
   >      but: <[RetentionLease{id='3-0', retainingSequenceNumber=979171579384517195, timestamp=5923124907862339051, source='zUdKvmZW'}, RetentionLease{id='3-1', retainingSequenceNumber=6017346890990993807, timestamp=3929571805510614766, source='OzrFxoRl'}, RetentionLease{id='3-2', retainingSequenceNumber=2725439553447783298, timestamp=3071769291030185046, source='CKtLGpoK'}, RetentionLease{id='3-3', retainingSequenceNumber=8598544668823806214, timestamp=7849418724783540755, source='vGgDkvNA'}]>
   >    at __randomizedtesting.SeedInfo.seed([ACF05F449C755A1:51C70B25254BA317]:0)
   >    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   >    at org.elasticsearch.index.seqno.ReplicationTrackerRetentionLeaseTests.testReplicaIgnoresOlderRetentionLeasesVersion(ReplicationTrackerRetentionLeaseTests.java:362)
   >    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   >    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   >    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   >    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
   >    at java.base/java.lang.Thread.run(Thread.java:834)
gwbrown added the >test-failure Triaged test failures from CI and :Distributed/CCR Issues around the Cross Cluster State Replication features labels on Feb 2, 2019
elasticmachine (Collaborator) commented:

Pinging @elastic/es-distributed

@dnhatn dnhatn self-assigned this Feb 2, 2019
dnhatn added a commit that referenced this issue Feb 2, 2019
If the innerLength is 0, the version won't be increased; then there will
be two RetentionLeases with the same term and version, but their leases
are different.

Relates #37951
Closes #38245
dnhatn added a commit to dnhatn/elasticsearch that referenced this issue Feb 2, 2019
If the innerLength is 0, the version won't be increased; then there will
be two RetentionLeases with the same term and version, but their leases
are different.

Relates elastic#37951
Closes elastic#38245
dnhatn added a commit that referenced this issue Feb 2, 2019
If the innerLength is 0, the version won't be increased; then there will
be two RetentionLeases with the same term and version, but their leases
are different.

Relates #37951
Closes #38245
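The commit message above describes the root cause: when the randomized lease set happens to be empty (the `innerLength == 0` case), the version is not bumped, so two different `RetentionLeases` states can carry the same term and version. The following is a minimal, hypothetical sketch of why that breaks a replica that discards updates with a version at or below its current one; `Leases`, `maybeApply`, and the field names are simplified stand-ins, not the actual Elasticsearch classes.

```java
import java.util.List;

// Hypothetical, simplified sketch of the bug described in the commit
// message: if the version is not increased when the lease contents
// change, a replica that ignores updates with version <= current will
// silently keep stale leases.
public class LeaseVersionSketch {
    // Simplified stand-in for org.elasticsearch.index.seqno.RetentionLeases
    record Leases(long primaryTerm, long version, List<String> leaseIds) {}

    static Leases current = new Leases(1, 5, List.of("peer-1"));

    // Replica-side apply logic: drop anything from an older term or
    // with a version at or below the current one.
    static void maybeApply(Leases incoming) {
        if (incoming.primaryTerm() < current.primaryTerm()
                || incoming.version() <= current.version()) {
            return; // treated as "older", even if the leases differ
        }
        current = incoming;
    }

    public static void main(String[] args) {
        // Buggy test setup: the lease contents change, but the version
        // stays at 5 (the "innerLength == 0" case above).
        Leases changedButSameVersion =
                new Leases(1, 5, List.of("peer-1", "peer-2"));
        maybeApply(changedButSameVersion);
        System.out.println(current.leaseIds()); // prints [peer-1]
    }
}
```

The fix direction implied by the commit message is to guarantee the version is bumped whenever the lease collection changes, so equal versions always imply equal leases.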
cbuescher (Member) commented:
Still failing on at least 6.x with a very similar-looking failure:

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+matrix-java-periodic/ES_BUILD_JAVA=java11,ES_RUNTIME_JAVA=zulu11,nodes=immutable&&linux&&docker/220/consoleFull

./gradlew :server:unitTest \
  -Dtests.seed=69F7C45D4D0C2BEE \
  -Dtests.class=org.elasticsearch.index.seqno.ReplicationTrackerRetentionLeaseTests \
  -Dtests.method="testAddOrRenewRetentionLease" \
  -Dtests.security.manager=true \
  -Dtests.locale=it-SM \
  -Dtests.timezone=Asia/Srednekolymsk \
  -Dcompiler.java=11 \
  -Druntime.java=11

This reproduces locally on the current 6.x head at 3bca1aa, which includes 0169965.
@dnhatn I'm not sure if this build was missing something, or if this is a different error.

cbuescher reopened this Feb 3, 2019
cbuescher (Member) commented:
Also reproduces on master at 3c1544d with the same reproduction line as above. I don't know the frequency of these failures yet, but I will mute these tests again just in case.

cbuescher (Member) commented:
Sorry, I didn't realize this fails in a different method (testAddOrRenewRetentionLease) than the original issue. The error looks like this:

FAILURE 0.23s | ReplicationTrackerRetentionLeaseTests.testAddOrRenewRetentionLease <<< FAILURES!
   > Throwable #1: java.lang.AssertionError:
   > Expected: <8739982039787194216L>
   >      but: was <6842866809365694223L>
   >    at __randomizedtesting.SeedInfo.seed([69F7C45D4D0C2BEE:1EBB383ADFEC3E5E]:0)
   >    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   >    at org.elasticsearch.index.seqno.ReplicationTrackerRetentionLeaseTests.assertRetentionLeases(ReplicationTrackerRetentionLeaseTests.java:382)
   >    at org.elasticsearch.index.seqno.ReplicationTrackerRetentionLeaseTests.testAddOrRenewRetentionLease(ReplicationTrackerRetentionLeaseTests.java:81)
   >    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   >    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   >    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   >    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
   >    at java.base/java.lang.Thread.run(Thread.java:834)

If it's unrelated, I can open a separate issue.

cbuescher (Member) commented:
Muted on master with 6ca7a91; muting on 6.x is pending (#38276).
