Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async startup of the SSHD server #19

Merged
merged 4 commits into from
Sep 21, 2017

Conversation

olivergondza
Copy link
Member

  • Fix race condition in SSHD.start().
  • Do not let Jenkins to wait until SSHD is fully up.
    • I have seen launch stuck for minutes while running parallel jenkins-test-harness tests as it was waiting for system entropy to launch a service I do not care for on the first place.
  • Some javadoc fixes while on it.

SEVERE: Failed SSHD.init
java.lang.Error: java.lang.reflect.InvocationTargetException
  at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:110)
  at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:176)
  at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:282)
  at jenkins.model.Jenkins$8.runTask(Jenkins.java:924)
  at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:210)
  at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:106)
  ... 8 more
Caused by: java.lang.IllegalArgumentException: Bad port number: -1
  at org.apache.sshd.SshServer.checkConfig(SshServer.java:282)
  at org.apache.sshd.SshServer.start(SshServer.java:342)
  at org.jenkinsci.main.modules.sshd.SSHD.start(SSHD.java:109)
  at org.jenkinsci.main.modules.sshd.SSHD.init(SSHD.java:146)
  ... 13 more
Copy link
Member

@oleg-nenashev oleg-nenashev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest using Timer.get() instead of Remoting executor service: https://github.com/jenkinsci/jenkins/blob/7f6f495f866bc234fe5a54ce866e8b214ba693b2/core/src/main/java/jenkins/util/Timer.java . It seams to be designed specifically for such background tasks

@@ -200,7 +207,15 @@ public static SSHD get() {

@Initializer(after= InitMilestone.JOB_LOADED,fatal=false)
public static void init() throws IOException, InterruptedException {
get().start();
MasterComputer.threadPoolForRemoting.submit(new Runnable() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Log something like "Scheduling SSHD startup" on the FINE level?

MasterComputer.threadPoolForRemoting.submit(new Runnable() {
@Override public void run() {
try {
get().start();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some logging before and after may be also useful for diagnostic purposes

@@ -73,7 +74,7 @@ public int getPort() {
/**
* Gets the current TCP/IP port that this daemon is running with.
*
* @return -1 if disabled, but never null.
* @return Actual port number or -1 if disabled.
*/
public synchronized int getActualPort() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 2 have: @CheckForSigned to enforce checks

@@ -64,7 +65,7 @@ public SSHD() {
* Returns the configured port to run SSHD.
*
* @return
* -1 to disable this, 0 to run with a random port, otherwise the port number.
* -1 if disabled, 0 if random port is selected, otherwise the port number configured.
*/
public int getPort() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 2 have: @CheckForSigned to enforce checks

Copy link
Member

@oleg-nenashev oleg-nenashev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Thanks @olivergondza

@oleg-nenashev
Copy link
Member

oleg-nenashev commented Sep 14, 2017

@jenkinsci/code-reviewers some reviews of edge cases would be appreciated.

The only risk I see is that SSH-dependent JenkinsRule-based tests and Configuration Management logic (e.g. SSH CLI) will need to wait and check the SSH status before they will be able to really connect to the instance. Looks fine for SSH clients, which should poll the status anyway

@olivergondza
Copy link
Member Author

Given my main motivation here are the tests and if this turns out to cause some problems for the tests, we may want to rethink the startup logic for jenkins-test-harness so it will not start the server at all unless someone call SSHD.get() as a signal that the test do care. I presume 99+% of the tests does not...

@oleg-nenashev
Copy link
Member

Regarding JTH, I agree.
The most of the tests should not wait for SSHD startup, and actually SSHD should be disabled in them at all if not required. Maybe JTH could just disable SSHD when running Jenkins and then tests can enable it by setting the SSHD port when it's required

@olivergondza
Copy link
Member Author

I guess we will have to try running the core tests against this to see if it would cause problems before merging this.

@olivergondza
Copy link
Member Author

@oleg-nenashev, no further reviews, mind if I release this and start using it as a test dependency so we start receiving feedback?

@oleg-nenashev
Copy link
Member

@olivergondza fine for me

@oleg-nenashev oleg-nenashev merged commit 32ea91a into jenkinsci:master Sep 21, 2017
@jglick
Copy link
Member

jglick commented Oct 5, 2017

Sorry, just saw this and it looks potentially wrong.

The port was already defaulted to -1 (as of #12 in sshd-1.11 ~ jenkinsci/jenkins#2795 in 2.46.2 / 2.54), so start() should have returned immediately. Only tests which actually need SSHD should be doing

SSHD.get().setPort(0);

So what tests were you seeing stuck @olivergondza? Perhaps you were using an older Jenkins baseline and simply needed to update?

@jglick
Copy link
Member

jglick commented Oct 5, 2017

IOW this change was not necessary for test performance; and may be introducing a race condition in production environments: you would expect that after Jenkins claims to be “up and running” that you could connect to a configured SSH service, for example via CLI, but now there is a window where you may not be able to.

jglick added a commit to jglick/sshd-module that referenced this pull request Oct 5, 2017
@olivergondza
Copy link
Member Author

So what tests were you seeing stuck @olivergondza? Perhaps you were using an older Jenkins baseline and simply needed to update?

You are right this was resolved by updating sshd to deploy your fix. I experienced all the tests to be stuck for tens of seconds waiting for Jenkins/SSHD to come up.

@olivergondza olivergondza deleted the async-startup branch October 6, 2017 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants