[CI] ClusterDisruptionIT: Address already in use #29244

Closed
DaveCTurner opened this issue Mar 26, 2018 · 6 comments
Labels
:Delivery/Build Build or test infrastructure Team:Delivery Meta label for Delivery team >test-failure Triaged test failures from CI v6.4.1

Comments

@DaveCTurner
Contributor

DaveCTurner commented Mar 26, 2018

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+periodic/1703/console failed because it tried to bind to a port that's already in use:

java.lang.RuntimeException: failed to start nodes
	at __randomizedtesting.SeedInfo.seed([1B588786F9BF83:C6650D14FA151162]:0)
	at org.elasticsearch.test.InternalTestCluster.startAndPublishNodesAndClients(InternalTestCluster.java:1374)
	at org.elasticsearch.test.InternalTestCluster.startNode(InternalTestCluster.java:1652)
	at org.elasticsearch.test.InternalTestCluster.startDataOnlyNode(InternalTestCluster.java:1757)
	at org.elasticsearch.test.InternalTestCluster.startDataOnlyNode(InternalTestCluster.java:1753)
	at org.elasticsearch.discovery.ClusterDisruptionIT.testSearchWithRelocationAndSlowClusterStateProcessing(ClusterDisruptionIT.java:366)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: BindTransportException[Failed to bind to [30210]]; nested: BindException[Address already in use (Bind failed)];
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.elasticsearch.test.InternalTestCluster.startAndPublishNodesAndClients(InternalTestCluster.java:1369)
	... 41 more
Caused by: BindTransportException[Failed to bind to [30210]]; nested: BindException[Address already in use (Bind failed)];
	at org.elasticsearch.transport.TcpTransport.bindToPort(TcpTransport.java:790)
	at org.elasticsearch.transport.TcpTransport.bindServer(TcpTransport.java:755)
	at org.elasticsearch.transport.MockTcpTransport.doStart(MockTcpTransport.java:407)
	at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:66)
	at org.elasticsearch.test.transport.MockTransportService$DelegateTransport.start(MockTransportService.java:647)
	at org.elasticsearch.transport.TransportService.doStart(TransportService.java:213)
	at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:66)
	at org.elasticsearch.node.Node.start(Node.java:641)
	at org.elasticsearch.test.InternalTestCluster$NodeAndClient.startNode(InternalTestCluster.java:854)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	... 1 more
Caused by: java.net.BindException: Address already in use (Bind failed)
	at java.net.PlainSocketImpl.socketBind(Native Method)
	at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
	at java.net.ServerSocket.bind(ServerSocket.java:375)
	at java.net.ServerSocket.bind(ServerSocket.java:329)
	at org.elasticsearch.mocksocket.MockServerSocket.access$001(MockServerSocket.java:32)
	at org.elasticsearch.mocksocket.MockServerSocket.lambda$bind$1(MockServerSocket.java:63)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.elasticsearch.mocksocket.MockServerSocket.bind(MockServerSocket.java:62)
	at org.elasticsearch.transport.MockTcpTransport.bind(MockTcpTransport.java:120)
	at org.elasticsearch.transport.MockTcpTransport.bind(MockTcpTransport.java:71)
	at org.elasticsearch.transport.TcpTransport.lambda$bindToPort$16(TcpTransport.java:773)
	at org.elasticsearch.common.transport.PortsRange.iterate(PortsRange.java:59)
	at org.elasticsearch.transport.TcpTransport.bindToPort(TcpTransport.java:771)
	... 14 more

It looks like a fix for this sort of thing went into #9527 but this doesn't seem to be in place any more.

FWIW the reproduction line was:

REPRODUCE WITH: ./gradlew :server:integTest \
  -Dtests.seed=1B588786F9BF83 \
  -Dtests.class=org.elasticsearch.discovery.ClusterDisruptionIT \
  -Dtests.method="testSearchWithRelocationAndSlowClusterStateProcessing" \
  -Dtests.security.manager=true \
  -Dtests.locale=es-NI \
  -Dtests.timezone=Pacific/Midway

I checked the worker and there's nothing binding to that port any more.
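
For reference, the innermost `java.net.BindException` in the trace above is what the JDK raises when a listener is bound to a port that another socket is already listening on. Here is a minimal sketch of that failure mode using only JDK classes. The class and method names are hypothetical, and plain `ServerSocket`s stand in for the `MockTcpTransport` in the trace.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical demo class, not part of Elasticsearch.
class PortAlreadyInUseDemo {

    /** Returns true if binding a second listener to an occupied port fails with BindException. */
    static boolean secondBindFails() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port; this socket is bound and listening.
        try (ServerSocket first = new ServerSocket(0, 50, loopback)) {
            int takenPort = first.getLocalPort();
            try (ServerSocket second = new ServerSocket()) {
                // Disable address reuse so the conflict is reported consistently.
                second.setReuseAddress(false);
                // Same address and port as the listener that is still open.
                second.bind(new InetSocketAddress(loopback, takenPort));
                return false;
            } catch (BindException e) {
                // Same exception type as the root cause in the CI failure.
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind failed: " + secondBindFails());
    }
}
```

Because the conflict only exists while the first socket is open, finding nothing bound to the port afterwards is consistent with a short-lived holder of that port.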

@DaveCTurner DaveCTurner added >test-failure Triaged test failures from CI v6.3.0 :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. labels Mar 26, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@ywelsch ywelsch added the :Delivery/Build Build or test infrastructure label Apr 20, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

@ywelsch ywelsch removed the :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. label Apr 20, 2018
@ywelsch
Contributor

ywelsch commented Apr 20, 2018

I think this fits better under "Test infrastructure", hence the relabeling.

@javanna
Member

javanna commented Aug 28, 2018

DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Aug 29, 2018
Today we support a static list of seed hosts in core Elasticsearch, and allow a
dynamic list of seed hosts to be provided via a file using the `discovery-file`
plugin. In fact the ability to provide a dynamic list of seed hosts is
increasingly useful in containerized environments, so this change moves this
functionality to core Elasticsearch to avoid the need for a plugin.

For BWC purposes the plugin still exists, but does nothing more than issue a
warning when it is used.

Furthermore, in order to start up nodes in integration tests we currently
assign a known port to each node before startup, which unfortunately sometimes
fails if another process grabs the selected port in the meantime. By moving the
`discovery-file` functionality into the core product we can use it to avoid
this race.

Relates elastic#29244
Closes elastic#33030
DaveCTurner added a commit that referenced this issue Aug 30, 2018
Today we support a static list of seed hosts in core Elasticsearch, and allow a
dynamic list of seed hosts to be provided via a file using the `discovery-file`
plugin. In fact the ability to provide a dynamic list of seed hosts is
increasingly useful, so this change moves this functionality to core
Elasticsearch to avoid the need for a plugin.

Furthermore, in order to start up nodes in integration tests we currently
assign a known port to each node before startup, which unfortunately sometimes
fails if another process grabs the selected port in the meantime. By moving the
`discovery-file` functionality into the core product we can use it to avoid
this race.

This change also moves the expected path to the file from
`$ES_PATH_CONF/discovery-file/unicast_hosts.txt` to
`$ES_PATH_CONF/unicast_hosts.txt`. An example of this file is not included in
distributions.

For BWC purposes the plugin still exists, but does nothing more than create the
example file in the old location, and issue a warning when it is used. We also
continue to support the old location for the file, but warn about its
deprecation.

Relates #29244
Closes #33030
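
The approach described in the commit message above can be sketched roughly as follows. This is not the actual Elasticsearch test code: the class and method names are hypothetical, plain `ServerSocket`s stand in for node transports, and the one-address-per-line layout is an assumption about the hosts file. Each listener binds to port 0 so that the OS picks a free port at bind time, and only then are the bound addresses written to the file, so no port is chosen before it is actually held.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of Elasticsearch.
class EphemeralPortSeedHostsSketch {

    /** Binds nodeCount listeners to OS-chosen ports and writes "host:port" lines to hostsFile. */
    static List<String> bindAndPublish(int nodeCount, Path hostsFile) {
        List<ServerSocket> listeners = new ArrayList<>();
        try {
            InetAddress loopback = InetAddress.getByName("127.0.0.1");
            List<String> addresses = new ArrayList<>();
            for (int i = 0; i < nodeCount; i++) {
                // Port 0: the OS assigns a free port atomically with the bind.
                ServerSocket listener = new ServerSocket(0, 50, loopback);
                listeners.add(listener);
                addresses.add(loopback.getHostAddress() + ":" + listener.getLocalPort());
            }
            // Publish the addresses only after every listener holds its port.
            Files.write(hostsFile, addresses);
            return addresses;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // A real test cluster would keep the listeners open; the demo releases them.
            for (ServerSocket listener : listeners) {
                try {
                    listener.close();
                } catch (IOException ignored) {
                    // Nothing useful to do if closing fails in this demo.
                }
            }
        }
    }

    /** Reads the published addresses back, as a discovering node might. */
    static List<String> readHosts(Path hostsFile) {
        try {
            return Files.readAllLines(hostsFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path hostsFile = Files.createTempFile("unicast_hosts", ".txt");
        System.out.println(bindAndPublish(3, hostsFile));
        System.out.println(readHosts(hostsFile));
        Files.delete(hostsFile);
    }
}
```

The design point is ordering: a port assigned before startup can be taken by another process in the meantime, whereas a port obtained through the bind itself cannot.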
@DaveCTurner
Contributor Author

Closing now that we have a plan of attack in #33675.

@mark-vieira mark-vieira added the Team:Delivery Meta label for Delivery team label Nov 11, 2020