This repository has been archived by the owner on Oct 11, 2023. It is now read-only.

Growing sshd_config after restarting multiple times prevents SSH access #2581

Closed
dan-osterrath opened this issue Nov 23, 2018 · 7 comments

Comments

@dan-osterrath

RancherOS Version: (ros os version)
seen on:
rancherOS Base-1.1.0.93-dab52c1 (ami-0655e569)
rancherOS Base-1.1.0.94-49689c2 (ami-c50a86aa)

Where are you running RancherOS? (docker-machine, AWS, GCE, baremetal, etc.)
AWS

We start and stop our EC2 development instances regularly and very often. It seems that at every startup the following 4 lines are appended to /etc/ssh/sshd_config. This went unnoticed for more than a year, until we suddenly could no longer SSH into our machines even though we had not changed anything. After a long investigation we found that the sshd daemon no longer starts because the sshd_config file contains the following 4 lines about 280 times:

UseDNS no
PermitRootLogin no
ServerKeyBits 2048
AllowGroups docker

It seems that this patch introduced this behaviour.

So we see 2 problems here:

  1. Why are these 4 lines appended to sshd_config on every boot? Existing entries should be replaced, or the append should be skipped when they are already present.
  2. Why does sshd fail on startup when these parameters have been appended too often? There are no error messages at all in /var/log/syslog. The only sshd line we see is the one below, and it appears only on boots where sshd did not fail.
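The "skip the append" behaviour asked for in point 1 could look roughly like the sketch below. It is only an illustration, not the RancherOS implementation: the function name `ensure_lines` and the constant `REQUIRED_LINES` are hypothetical, and the directive list is simply copied from the report above.

```python
from pathlib import Path

# The four directives quoted in the report (copied as-is, not verified
# against the RancherOS source).
REQUIRED_LINES = [
    "UseDNS no",
    "PermitRootLogin no",
    "ServerKeyBits 2048",
    "AllowGroups docker",
]


def ensure_lines(config_path, required=REQUIRED_LINES):
    """Append each required line only if it is not already present.

    Returns the list of lines that were actually appended, so a second
    call on the same file returns an empty list.
    """
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    present = {line.strip() for line in text.splitlines()}
    missing = [line for line in required if line not in present]
    if missing:
        # Make sure existing content ends with a newline before appending.
        if text and not text.endswith("\n"):
            text += "\n"
        path.write_text(text + "\n".join(missing) + "\n")
    return missing
```

Because the function checks for each line before writing, running it on every boot against a persistent console leaves the file unchanged after the first run.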

sshd[907]: Server listening on 0.0.0.0 port 22.

We also could not determine a fixed number of repetitions at which sshd starts to fail. One of our instances failed at about 260 repetitions of these 4 lines.
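To check how far a given instance has drifted, the repetitions can be counted from the file contents. The helper below is a minimal sketch with a hypothetical name (`count_repeated_lines`); it treats every non-blank, non-comment line as a directive and does not parse sshd_config syntax beyond that.

```python
from collections import Counter


def count_repeated_lines(config_text):
    """Return {line: count} for non-blank, non-comment lines seen more than once."""
    counts = Counter(
        line.strip()
        for line in config_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    )
    return {line: n for line, n in counts.items() if n > 1}
```

Passing it the contents of /etc/ssh/sshd_config (for example via `open(path).read()`) gives a quick view of which lines have accumulated and how many copies exist.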

@dan-osterrath
Author

We compared the 2 most recent overlays for /etc/ssh/sshd_config. See the sshd_config.diff for details of how the config file has changed over time.

@dan-osterrath
Author

A temporary workaround is to detach the volume from the EC2 instance, attach it as a secondary volume to another EC2 instance, manually edit the sshd_config file in the overlay to remove the duplicate lines, and then reattach the volume to the original EC2 instance.
Of course, this only works for the next X restarts.
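The manual "remove duplicate lines" step can be scripted once the volume is mounted on the helper instance. The following is a sketch under assumptions: `dedupe_config` is a hypothetical helper, it keeps the first occurrence of each directive line (sshd_config(5) says the first obtained value for a keyword is used unless noted otherwise), and it does not understand `Match` blocks, where identical lines may legitimately repeat.

```python
def dedupe_config(config_text):
    """Drop repeated directive lines, keeping the first occurrence of each.

    Blank lines and comment lines are always kept as they are.
    """
    seen = set()
    kept = []
    for line in config_text.splitlines():
        key = line.strip()
        if key and not key.startswith("#"):
            if key in seen:
                continue  # an identical directive was already kept
            seen.add(key)
        kept.append(line)
    return "\n".join(kept) + "\n"
```

It would be prudent to write the result to a new file and compare it with the original before replacing sshd_config in the overlay.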

@niusmallnan
Contributor

That PR should not be the cause: it was introduced in 1.3.0, but you are using 1.1.0.

It seems that you are not using the default console.
The default console will be rebuilt every boot, so it will not be set repeatedly.
The data of other consoles is persistent, so these lines keep accumulating with every boot.

This looks like a bug; we will fix it.

@gkirchner

> It seems that you are not using the default console.
> The default console will be rebuilt every boot, so it will not be set repeatedly.

We are in fact using the Ubuntu console.

@kordeviant

I have the same problem on the Ubuntu console. @Aisuko, could you please tell us how to fix our existing RancherOS installation, or should we install a new one?

@kingsd041
Contributor

We will fix this issue in v1.5.1 @kordeviant

@kingsd041
Contributor

Fixed this issue in RancherOS v1.5.1-rc1
@dan-osterrath Thank you for your feedback
