rawhide: upgrades fail with new openssh-9.0p1-10.fc38.1 #1394

Closed
dustymabe opened this issue Jan 27, 2023 · 31 comments

@dustymabe
Member

The upgrade tests started failing in 38.20230126.91.0. One of the transitions there is:

openssh 9.0p1-9.fc38 -> 9.0p1-10.fc38.1

If you recreate the upgrade test yourself and have serial console access to a machine:

cosa run -c --qemu-image tmp/kola-qemu-cache/fedora-coreos-38.20230124.91.0-qemu.x86_64.qcow2

then (to rebase to a local build):

sudo rpm-ostree rebase --experimental ostree-unverified-image:oci-archive:/var/mnt/workdir/builds/38.20230127.dev.0/x86_64/fedora-coreos-38.20230127.dev.0-ostree.x86_64.ociarchive

then, after the reboot, I can see from the serial console that the sshd service has failed, and why:

$ sudo systemctl status sshd.service
● sshd.service - OpenSSH server daemon
     Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; preset: enabled)
     Active: activating (auto-restart) (Result: exit-code) since Fri 2023-01-27 17:42:50 UTC; 37s ago
       Docs: man:sshd(8)
             man:sshd_config(5)
    Process: 2116 ExecStart=/usr/sbin/sshd -D $OPTIONS (code=exited, status=1/FAILURE)
   Main PID: 2116 (code=exited, status=1/FAILURE)
        CPU: 8ms
[core@cosa-devsh ~]$ journalctl --since='1 minutes ago' -u sshd
Jan 27 17:43:32 cosa-devsh systemd[1]: sshd.service: Scheduled restart job, restart counter is at 13.
Jan 27 17:43:32 cosa-devsh systemd[1]: Stopped sshd.service - OpenSSH server daemon.
Jan 27 17:43:32 cosa-devsh systemd[1]: Starting sshd.service - OpenSSH server daemon...
Jan 27 17:43:32 cosa-devsh sshd[2197]: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Jan 27 17:43:32 cosa-devsh sshd[2197]: @         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
Jan 27 17:43:32 cosa-devsh sshd[2197]: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Jan 27 17:43:32 cosa-devsh sshd[2197]: Permissions 0640 for '/etc/ssh/ssh_host_rsa_key' are too open.
Jan 27 17:43:32 cosa-devsh sshd[2197]: It is required that your private key files are NOT accessible by others.
Jan 27 17:43:32 cosa-devsh sshd[2197]: This private key will be ignored.
Jan 27 17:43:32 cosa-devsh sshd[2197]: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Jan 27 17:43:32 cosa-devsh sshd[2197]: @         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
Jan 27 17:43:32 cosa-devsh sshd[2197]: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Jan 27 17:43:32 cosa-devsh sshd[2197]: Permissions 0640 for '/etc/ssh/ssh_host_ecdsa_key' are too open.
Jan 27 17:43:32 cosa-devsh sshd[2197]: It is required that your private key files are NOT accessible by others.
Jan 27 17:43:32 cosa-devsh sshd[2197]: This private key will be ignored.
Jan 27 17:43:32 cosa-devsh sshd[2197]: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Jan 27 17:43:32 cosa-devsh sshd[2197]: @         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
Jan 27 17:43:32 cosa-devsh sshd[2197]: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Jan 27 17:43:32 cosa-devsh sshd[2197]: Permissions 0640 for '/etc/ssh/ssh_host_ed25519_key' are too open.
Jan 27 17:43:32 cosa-devsh sshd[2197]: It is required that your private key files are NOT accessible by others.
Jan 27 17:43:32 cosa-devsh sshd[2197]: This private key will be ignored.
Jan 27 17:43:32 cosa-devsh sshd[2197]: sshd: no hostkeys available -- exiting.
Jan 27 17:43:32 cosa-devsh systemd[1]: sshd.service: Main process exited, code=exited, status=1/FAILURE
Jan 27 17:43:32 cosa-devsh systemd[1]: sshd.service: Failed with result 'exit-code'.
Jan 27 17:43:32 cosa-devsh systemd[1]: Failed to start sshd.service - OpenSSH server daemon.

This means that existing systems that are upgraded will need their host keys migrated to adhere to the new, stricter permission requirements.

In the distgit commit for this change there is an RPM scriptlet to migrate the existing host keys to conforming permissions.

We need to figure out how to handle this on OSTree systems.
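
A minimal sketch of what such a permission migration could look like. This is not the scriptlet from the distgit commit: it assumes GNU find and the default `ssh_host_*_key` file names seen in the log above, the function name `migrate_host_keys` and its directory argument are hypothetical, and ownership changes are left out because they require root.

```shell
#!/bin/sh
# Hypothetical helper (not the actual Fedora scriptlet): tighten any private
# host key in a directory that is accessible by group or others.
migrate_host_keys() {
    dir="${1:-/etc/ssh}"
    # -perm /077 (GNU find) matches files with any group/other permission bit set.
    # Public keys (*.pub) do not match the name pattern and are left alone.
    find "$dir" -maxdepth 1 -type f -name 'ssh_host_*_key' -perm /077 \
        -exec chmod 600 -- {} +
}
```

Run as root, something like `migrate_host_keys /etc/ssh` would leave already conforming keys untouched, so repeating it should be harmless.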

@cgwalters
Member

Instead of a %post that code should be run as a systemd unit.
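
One way to picture that suggestion is a one-shot unit ordered before sshd.service. The sketch below only writes an illustrative unit file into a scratch directory so it can be inspected. The unit name, stamp file, and script path are all hypothetical and are not taken from the openssh package.

```shell
#!/bin/sh
# Write a hypothetical unit file into the given directory (default: a temp dir)
# and print its path. Every name and path inside the unit is an assumption.
write_example_unit() {
    out="${1:-$(mktemp -d)}/example-ssh-host-keys-migration.service"
    cat > "$out" <<'EOF'
[Unit]
Description=Example: fix SSH host key permissions before sshd starts
Before=sshd.service
# Skip once a previous run has left its stamp file behind.
ConditionPathExists=!/var/lib/example-ssh-host-keys-migrated

[Service]
Type=oneshot
ExecStart=/usr/libexec/example/ssh-host-keys-migration.sh
RemainAfterExit=yes

[Install]
WantedBy=sshd.service
EOF
    printf '%s\n' "$out"
}
```

Unlike a %post scriptlet, a unit like this runs on the booted system, which is the property that matters on OSTree-based hosts where package scriptlets run at build time rather than on the machine being upgraded.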

@travier
Member

travier commented Jan 27, 2023

@travier
Member

travier commented Jan 30, 2023

Pushed a draft PR in https://src.fedoraproject.org/rpms/openssh/pull-request/39

dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue Feb 1, 2023
The issue described in coreos/fedora-coreos-tracker#1394
has been fixed upstream in fedora-selinux/selinux-policy#1574
and landed in `selinux-policy-38.6-1.fc38`.
@travier
Member

travier commented Feb 22, 2023

I've updated my PR from the discussion at the meeting (same link).

@dustymabe
Member Author

We discussed this during the community meeting today.

We were unable to convince the maintainer to continue supporting relaxed file permissions on host keys, but we did find out how to discover the in-use host keys. As mentioned in #1394 (comment), @travier updated https://src.fedoraproject.org/rpms/openssh/pull-request/39 to use the new method. We also discussed ways we could roll this out to FCOS users. @travier, @jlebon, and I will get together tomorrow to flesh out the strategy there.

@dustymabe
Member Author

opened https://bugzilla.redhat.com/show_bug.cgi?id=2172956 so we can submit this as a freeze exception for Fedora 38 beta.

@dustymabe
Member Author

@travier @jlebon and I discussed this last week. The two options we discussed were:

  1. We roll out the changes early in Fedora 37, including an external ExecStartPre script for sshd that would detect non-conforming permissions and sleep for a period of time. The benefit of doing this is that after the sleep the user can still log in to the system. The downside is that it's more complex (and more work), and we could introduce errors of our own if we don't test comprehensively enough.
  2. We go with the migration shipped by the RPM (we're working on an update to it to make it more palatable) and give users plenty of warning with a coreos-status post.

I believe we are leaning towards option 2, since we are already spread a bit thin and it's less complex, though it does leave room for people to get locked out of their systems if the script errors.
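
For reference, the check-and-sleep idea in option 1 could be sketched roughly as below. This is an illustration only, not something that was shipped: the function name, the default delay, and the warning text are assumptions, and it relies on GNU find.

```shell
#!/bin/sh
# Hypothetical ExecStartPre helper for option 1: warn and pause when a host key
# has group/other permission bits set. Delay and message are assumptions.
warn_on_open_host_keys() {
    dir="${1:-/etc/ssh}"
    delay="${2:-60}"
    # GNU find: -perm /077 matches any group/other permission bit.
    open_keys=$(find "$dir" -maxdepth 1 -type f -name 'ssh_host_*_key' -perm /077)
    if [ -n "$open_keys" ]; then
        printf 'WARNING: host keys with non-conforming permissions:\n%s\n' "$open_keys" >&2
        sleep "$delay"
    fi
    # Always succeed so that sshd is still started after the delay.
    return 0
}
```

The helper deliberately never fails, which matches the stated benefit of option 1: the user is slowed down and warned, but can still log in afterwards.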

@dustymabe dustymabe added the meeting topics for meetings label Mar 1, 2023
@jlebon
Member

jlebon commented Mar 1, 2023

We discussed this in today's community meeting:

  • AGREED: we will go with option 2, where we will rely on an updated migration service shipped by sshd that works on OSTree-based systems and give a heads up on coreos-status (jlebon, 16:58:26)

@dustymabe dustymabe added fallout/f38 and removed meeting topics for meetings labels Mar 7, 2023
@dustymabe
Member Author

New updates landed in bodhi with the fix:

  • openssh-9.0p1-12.fc38.1
  • fedora-release-38-0.30

These were fast-tracked in coreos/fedora-coreos-config@d3b0047

This issue never made it into a released production stream of Fedora CoreOS so we can close it now.

@travier
Member

travier commented Mar 8, 2023

@dustymabe Did you test this script on a system with a non en_US locale? I think this is very likely to break.

@travier
Member

travier commented Mar 8, 2023

The chmod/chown calls in https://src.fedoraproject.org/rpms/openssh/pull-request/40#_3__35 are missing quotes. But I agree it's unlikely to cause a problem.
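
To illustrate the quoting concern in general terms (this is not the code from the PR), a loop along these lines keeps each path intact even if it contains spaces or glob characters. The function name `tighten_keys` is hypothetical, and chown is omitted because it requires root.

```shell
#!/bin/sh
# Hypothetical loop: read one path per line on stdin and tighten its mode.
# Quoting "$key" prevents word splitting and glob expansion of the path.
tighten_keys() {
    while IFS= read -r key; do
        # Ignore anything that is not an existing regular file.
        [ -f "$key" ] || continue
        chmod 600 -- "$key"
    done
}
```

With an unquoted `$key`, a path such as `/etc/ssh/my key` would be split into two arguments and chmod would act on the wrong names.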

@dustymabe
Member Author

> @dustymabe Did you test this script on a system with a non en_US locale? I think this is very likely to break.

sigh.. of course I didn't. I liked the sshd -T | grep "^hostkey " solution but that didn't work.

@dustymabe
Member Author

@jlebon
Member

jlebon commented Mar 8, 2023

I think just calling it as LANG=C sshd -T should work.
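
A sketch of how the warning text could be parsed under a fixed locale. It assumes the message format shown in the journal output earlier in this issue. The function name `open_key_paths` is hypothetical and this is not the script from the PR.

```shell
#!/bin/sh
# Hypothetical parser: read sshd diagnostics on stdin and print the path of
# every key reported as "too open". Assumes the English message format.
open_key_paths() {
    LC_ALL=C sed -n "s/.*Permissions 0[0-7]* for '\(.*\)' are too open\.\$/\1/p"
}
```

It might be fed from something like `LANG=C sshd -T 2>&1 | open_key_paths`. As the thread notes, this only stays safe as long as the message format itself does not change.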

@dustymabe
Member Author

> I think just calling it as LANG=C sshd -T should work.

I'll PR that change if anyone knows the answer to #1394 (comment)

@bgilbert
Contributor

bgilbert commented Mar 8, 2023

@dustymabe Not sure I understand the question. The idea behind LANG=C is that we'll be safe as long as the format string itself doesn't change. We could have a kola test that tries setting bad permissions on a key, though.

@bgilbert
Contributor

bgilbert commented Mar 8, 2023

Looking at the openssh-portable codebase, I'm not actually sure OpenSSH is internationalized at all.

@jlebon
Member

jlebon commented Mar 8, 2023

Indeed. I see a few gettext hits in the openssh repo, but it's not prevalent. Scanning the openssh-server RPM, I don't see files in /usr/share/locale/. Doing strings on the binary doesn't show any evidence either.

@travier
Member

travier commented Mar 8, 2023

OK, I hope we get away with this as it makes me very uncomfortable to parse error output as root and do chmod/chown changes like that. Maybe we should remove those files completely once we've made a barrier release.

@dustymabe
Member Author

> Maybe we should remove those files completely once we've made a barrier release.

We're kind of bound by what the openssh RPM does here. It will need to be supported for a long time since Fedora itself doesn't do barrier releases.

@bgilbert
Contributor

bgilbert commented Mar 8, 2023

Presumably we'll need to continue supporting custom configs with mode 640 host keys indefinitely. I've added a regression test for that part in coreos/fedora-coreos-config#2285. The test also implicitly exercises the upgrade path, but that isn't an explicit goal, since I wanted to make the test independent of the particular mitigation technique (or of whether any mitigation is needed at all).

@dustymabe
Member Author

> Presumably we'll need to continue supporting custom configs with mode 640 host keys indefinitely.

I'm not sure we can 100%. We can support it if the user wrote it out via Ignition (like your test does), but if they configure it some other way (config management of some sort) after the ssh-host-keys-migration.service has run then they are out of luck.

@bgilbert
Contributor

bgilbert commented Mar 8, 2023

Right, I meant via Ignition. Config management is out of scope for FCOS.

@travier
Member

travier commented Mar 9, 2023

> > Maybe we should remove those files completely once we've made a barrier release.
>
> We're kind of bound by what the openssh RPM does here. It will need to be supported for a long time since Fedora itself doesn't do barrier releases.

Fedora will have to keep it until Fedora 39 (2 releases) and can drop it in Fedora 40 as they support skipping a release.

We have barrier releases so we could remove all of this earlier.

@jlebon
Member

jlebon commented Mar 9, 2023

> We have barrier releases so we could remove all of this earlier.

Because of #1394 (comment), don't we have the opposite problem? Once the maintainers decide to drop it, we'll need to decide whether to keep carrying the mitigation or not. Ideally we can just give a heads up to users and let it drop out.

@travier
Member

travier commented Mar 9, 2023

OK, if users created host keys with 640 mode then indeed we would have to keep supporting that or their config would break on a newer FCOS.

But I'd say that's reasonable as it would be new systems, not existing systems that would break, so announcing the change and waiting should be good enough.

bgilbert added a commit to coreos/fedora-coreos-config that referenced this issue Mar 9, 2023
Exclude the embedded SSH private key from Red Hat secret scanning.

For coreos/fedora-coreos-tracker#1394.
@jlebon
Member

jlebon commented Mar 10, 2023

coreos-status email: https://hackmd.io/aw7c1xNLRNSxcomW63Carw

@dustymabe
Member Author

Looks like we've had one report of translations messing us up on this in the wild:

Permissoin 0640 for `/etc/ssh/ssh_host_rsa_key` are too open.

@dustymabe
Member Author

Never mind. I think it was a typo from the user.

@travier
Member

travier commented Mar 14, 2023

Updates in https://src.fedoraproject.org/rpms/fedora-release/pull-request/252. The change only works for FCOS right now.

@travier
Member

travier commented Mar 28, 2023

https://bodhi.fedoraproject.org/updates/FEDORA-2023-bb12d0efe6 > This would be good to test via a fast-track in FCOS CI.

HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this issue Oct 10, 2023
Exclude the embedded SSH private key from Red Hat secret scanning.

For coreos/fedora-coreos-tracker#1394.