[CI] MasterDisruptionIT.testFailWithMinimumMasterNodesConfigured #37699

Closed
costin opened this issue Jan 22, 2019 · 2 comments
Labels
:Distributed Coordination/Cluster Coordination (Cluster formation and cluster state publication, including cluster membership and fault detection.)
>test-failure (Triaged test failures from CI)

Comments

@costin
Member

costin commented Jan 22, 2019

Not sure whether the Lucene upgrade affected the test suite, but there are two failures in
MasterDisruptionIT.testFailWithMinimumMasterNodesConfigured. It looks important enough not to mute, as it deserves immediate attention.

Expected: "node_t2"
     but: was "node_t1"
	at __randomizedtesting.SeedInfo.seed([4E701D1D163A434D:1CB79551CE20161A]:0)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.junit.Assert.assertThat(Assert.java:956)
	at org.elasticsearch.discovery.AbstractDisruptionTestCase.lambda$assertMaster$2(AbstractDisruptionTestCase.java:202)

See
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+intake/1420/console
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=debian/197/console

costin added the >test-failure (Triaged test failures from CI) and :Distributed Indexing/Engine (Anything around managing Lucene and the Translog in an open shard.) labels Jan 22, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@costin
Member Author

costin commented Jan 22, 2019

original-brownbear added the :Distributed Coordination/Cluster Coordination (Cluster formation and cluster state publication, including cluster membership and fault detection.) label and removed the :Distributed Indexing/Engine (Anything around managing Lucene and the Translog in an open shard.) label Jan 25, 2019
DaveCTurner self-assigned this Jan 31, 2019
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Jan 31, 2019
Today we use `AbstractDisruptionTestCase` to test the behaviour of things like
master elections in the presence of cluster disruptions. These tests have
rather enthusiastic fault detection settings, detecting a fault if a single
ping fails, with a one-second timeout. Furthermore, there are some tests that
assert the identity of the master remains unchanged during some disruption, and
these assertions fail rather often thanks to the overly sensitive fault
detector.

However, in a number of these tests the fault detector need not be this
sensitive. This commit moves some such tests into their own test suite and uses
more sensible fault-detection settings to avoid the kind of master instability
that is causing CI failures.

Closes elastic#37699
DaveCTurner added a commit that referenced this issue Feb 1, 2019
Closes #37699
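
For readers unfamiliar with why a "single failed ping, one-second timeout" detector destabilises the master, here is a minimal sketch of the idea. It is a toy model only: the class `FaultDetectorSketch`, the method `declaresFault`, and all latency values are hypothetical and are not taken from Elasticsearch or from the commit itself. It assumes a detector that declares a fault once a given number of consecutive pings either get no response or exceed the timeout.

```java
import java.util.List;

/** Toy model of a ping-based fault detector. Hypothetical, not Elasticsearch code. */
public class FaultDetectorSketch {

    /**
     * Returns true if the detector would declare the node faulty, i.e. if
     * retryCount consecutive pings each fail or take longer than timeoutMillis.
     * A negative latency stands for a ping that got no response at all.
     */
    public static boolean declaresFault(List<Long> pingLatenciesMillis, int retryCount, long timeoutMillis) {
        int consecutiveFailures = 0;
        for (long latency : pingLatenciesMillis) {
            boolean failed = latency < 0 || latency > timeoutMillis;
            // A successful ping resets the failure streak.
            consecutiveFailures = failed ? consecutiveFailures + 1 : 0;
            if (consecutiveFailures >= retryCount) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // One slow ping (1.5s, for example a pause on a busy CI worker) among healthy ones.
        List<Long> latencies = List.of(20L, 35L, 1500L, 25L, 30L);

        // Sensitive settings as described in the commit message: one failed ping, 1s timeout.
        System.out.println("sensitive: " + declaresFault(latencies, 1, 1_000));

        // More lenient settings; these values are chosen for illustration only.
        System.out.println("lenient:   " + declaresFault(latencies, 3, 10_000));
    }
}
```

Under the sensitive settings the single slow ping is enough to report a fault, which in a real cluster could trigger a new election and change the master that the test asserts on. Under the lenient settings the same ping sequence is tolerated.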