postprocess: run systemctl preset-all #77
Still doing some testing of this, having a weird issue with local cosa builds that is apparently the temp webserver dying during |
I verified that the system looks fine after this (haven't yet verified that it fixes our upgrade issue though). However, I also notice that |
@jlebon and I had a quick chat about this, I'm going to try updating it to run |
c591bb3 to 6d5b897
OK updated, and here's the diff from the result:
And yep, doing the preset-all helps ensure we're fully consistent. |
CL works this way now: default-enabled units are enabled via symlinks in So, on CL, the only way users can disable a default-enabled unit is to mask it. That works, but is an abuse of masks and tends to confuse users. The decision to ship enablement links in So let's please not do this. Presets are 100% the way to go. Can rpm-ostree be taught not to mess with enablement links in |
The CL issue is coreos/bugs#2178 and the upstream issue is systemd/systemd#4830. |
I think that's a systemd bug.
What's difficult to change?
We are using presets I'd say.
Ugh...that'd be a huge special case in the currently generic Moreover, dropping all this stuff in
|
Hmm, though if we do
Right? This also lets us transition later on from defaults in The downside though is that there's no way for admins to say "definitely always have this unit enabled, even if a later update disables it", though that's still better UX-wise than |
More of a missing feature. The systemd maintainers had not originally considered allowing
Masks are for saying that you don't want a service to be started, even if another service pulls it in as a dependency. They're stronger than ordinary disablement, and aren't really for casual use.
IIRC the main issues were:
There's also the pain of testing the upgrade path, since a botched conversion will badly brick a system.
If we're precalculating the preset and packaging the resulting symlinks (in either
Yep, that's a feature. For example, on CL we changed the default from ntpd to systemd-timesyncd. Users who configured a custom (Changing the default is still a breaking change for new launches, of course. We'd need to provide lots of advance notice. But breaking the installed base is worse.) Fixing systemd/systemd#4830 wouldn't actually address this problem. It would allow the system to encode the distinction between default-{en,dis}abled and manually-{en,dis}abled services, but we can't reasonably ask users to use that knob. How would we document it? "When adding custom configuration to a service, always enable that service using Ignition, even if it's enabled by default, so updates don't break your machine in the future." |
I think you're really arguing against changing services that could have reasonably been configured by the user at all. Not totally buying into the "we're just breaking new installs". (This discussion gets into release cycles, epochs etc.)
One side note, rpm-ostree supports this today, if you
We could do that...it would be basically reverting #73 and going the other way? |
It's not just configuration, of course; it's behavior the user could be relying on. If a service is really only an implementation detail... then it should be pulled in as a dependency of something else, and there's no explicit enablement to worry about here. But if the service is explicitly enabled, I'm assuming it's probably relevant to the user.
Do you expect users to perform that operation? |
Hmm, OK how about:
? Then,
OK, so here's a hypothetical: let's imagine this was FCOS, at some point eventually we want to be able to drop I agree that we should be very careful with any change in the default set of enabled services. But I'm not sure if that implies we shouldn't have that ability at all as maintainers. |
BTW remember ostree has a separate
I think so. |
Sure. But we'd very likely want to change the default for new installs before switching existing nodes. The latter might not happen until we stop shipping the old daemon. I'm not thrilled with the 3-way merge approach here, but it'd probably be okay as a starting point, and the infrastructure is already set up to work in those terms. I guess we can work around it with custom upgrade logic if needed.
Sure. At the least, let's add a kola test that checks default unit enablement against a hardcoded list.
Right, fair enough. |
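A rough sketch of what such an enablement check could look like, written in Python rather than in kola's own test format. The contents of `EXPECTED_ENABLED` and the helper names are hypothetical; the only real interface assumed is `systemctl list-unit-files --state=enabled --no-legend`, whose first column is the unit name.

```python
import subprocess

# Hypothetical hardcoded list; a real test would carry the distro's actual defaults.
EXPECTED_ENABLED = {"NetworkManager.service", "zincati.service"}


def parse_unit_list(text):
    """Return the set of unit names from `systemctl list-unit-files --no-legend` output."""
    return {line.split()[0] for line in text.splitlines() if line.split()}


def enabled_units():
    """Ask systemd which unit files are currently enabled (needs a running systemd)."""
    out = subprocess.run(
        ["systemctl", "list-unit-files", "--state=enabled", "--no-legend"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_unit_list(out)


def enablement_drift(expected, actual):
    """Return (missing, unexpected) unit names, each sorted for stable output."""
    return sorted(expected - actual), sorted(actual - expected)


if __name__ == "__main__":
    missing, unexpected = enablement_drift(EXPECTED_ENABLED, enabled_units())
    if missing or unexpected:
        raise SystemExit(f"enablement drift: missing={missing} unexpected={unexpected}")
```

Reporting both directions of drift means the test fails when a default-enabled unit disappears and also when a new unit becomes enabled without the list being updated.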
In some ways I wonder if presets really are the right way for immutable OSs to ship defaults. What advantage do we have over shipping the enablement links in |
Yeah, I agree they're not intended for that.
On CL, my answer would have been that they're a more natural way to express unit enablement than systemd-tmpfiles. On FCOS, where
Yeah, that seems to be the logical conclusion. |
@cgwalters Want to update this patch to just add a |
6d5b897 to 57b4663
Sure, done, though haven't tested it yet. One issue is this is going to be a point of divergence between RHCOS and FCOS. |
Hmm, offhand it seems we should be able to switch to this in RHCOS too, right? Enablements shouldn't change on update going from the previous version of this patch to this latest version. |
Yes, though we need to think about upgrades from the case of dropping the units from |
Yup, that's what I meant by "previous version of this patch" :) The only thing I can think of is if any preset changed from enabled to disabled since the switch to |
This is a followup to coreos#73. This way we're sure that the enabled units are just things that have been preset.
57b4663 to bc8c231
Here's a diff of the build with this patch; I notice this also fixes the dbus issues mentioned here. Basically we have the usual "things that change every build"; the unit file changes look expected to me:
|
Um...like including the versions in
I see what you're arguing though that the But I still keep circling back to: it seems cleanest if systemd supported more fully having things enabled statically in Oh actually, a reason not to delete the preset files is that it would break package layering for anything that had preset file config. |
OK, so going over the final diff between before #73 and after this patch with a fine tooth comb, I see there are only three files which we lose:
The first one is a symlink to This was four years ago though. And testing now, I don't actually see systemd complaining about this anymore (and Anyway, this LGTM. Will give others some time to do a final review before merging. |
I wonder if we should try to get them to drop it from the rpm? |
Here goes nothing: https://src.fedoraproject.org/rpms/timedatex/pull-request/1 |
(I renamed this PR for posterity to reflect the final approach we went with.) |
A lot of backstory on this, but essentially right now, we always bake a run of `systemctl preset-all` into the OSTree because upgrading hosts rely on these links for service enablement. In hindsight, we should've just stuck with systemd presets alone as canonical from the get-go, though it's a bit difficult now to transition from one to the other without breaking things. (Though I'll note it's not impossible, since we do have update barriers which could allow us to e.g. run a script to restore lost symlinks.)
For now though, let's at least fix the ability to disable services, which is a pretty big gap in our Ignition configuration story right now.
Related: systemd/systemd#15205
Related: coreos#77
Closes: coreos/fedora-coreos-tracker#392
Commit 059e64f generalized the rpm-ostree count me unit activation using the systemd preset configuration file. That change incorrectly assumed that all rpm-ostree based variants were using presets as the source of truth for enabled services, but this is currently only true for Fedora CoreOS.
Work is in progress to update Silverblue/Kinoite and IoT to preset-all by default. However, that change will most likely be F36+ only.
Thus this change statically enables the timer unit for IoT, Silverblue & Kinoite. Users can still opt out of counting as before by masking the unit, and this will not re-enable it for those that opted out earlier.
See:
- https://pagure.io/workstation-ostree-config/pull-request/246
- https://github.com/coreos/fedora-coreos-config/blob/testing-devel/manifests/ignition-and-ostree.yaml#L39..L48
- coreos/fedora-coreos-config#73
- coreos/fedora-coreos-config#77
- coreos/fedora-coreos-config#122
- coreos/rpm-ostree#1803
Fixes: 059e64f 90-default.preset: Enable rpm-ostree count me by default
A compile time option is added to select behaviour: by default UNIT_FILE_PRESET_ENABLE_ONLY is still used, but the intent is to change to UNIT_FILE_PRESET_FULL at some point in the future. Distros that want to opt in can use the config option to change the behaviour.
(The option is just a boolean: it would be possible to make it multi-valued, and allow full, enable-only, disable-only, none. But so far nobody has asked for this, and it's better not to complicate things needlessly.)
With the configuration option flipped, instead of only doing enablements, perform a full preset on first boot. The reason is that although `/etc/machine-id` might be missing, there may be other files provisioned in `/etc` (in fact, this use case is mentioned in `log_execution_mode`). Some of those possible files include enablement symlinks even if presets dictate that the unit should be disabled.
Such a seemingly contradictory situation occurs in {RHEL,Fedora} CoreOS, where we ship `/etc` as if `preset-all` were called. However, we want to allow users to disable default-enabled services via Ignition, which does this by creating preset dropins before switchroot. (For why we do `preset-all` at compose time, see: coreos/fedora-coreos-config#77).
For example, the composed FCOS image has an `enable zincati.service` preset and an enablement for that in `/etc`, while at boot time when we switch root, there may be a `disable zincati.service` preset with higher precedence. In that case, we want systemd to disable the service.
This is essentially a revert of 304b307. It seems like systemd *used* to do this, but it was changed to try to make the container workflow a bit faster.
Resolves: coreos/fedora-coreos-tracker#392
Co-authored-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
A compile time option is added to select behaviour: by default UNIT_FILE_PRESET_ENABLE_ONLY is still used, but the intent is to change to UNIT_FILE_PRESET_FULL at some point in the future. Distros that want to opt-in can use the config option to change the behaviour. (The option is just a boolean: it would be possible to make it multi-valued, and allow full, enable-only, disable-only, none. But so far nobody has asked for this, and it's better not to complicate things needlessly.) With the configuration option flipped, instead of only doing enablements, perform a full preset on first boot. The reason is that although `/etc/machine-id` might be missing, there may be other files provisioned in `/etc` (in fact, this use case is mentioned in `log_execution_mode`). Some of those possible files include enablement symlinks even if presets dictate it should be disabled. Such a seemingly contradictory situation occurs in {RHEL,Fedora} CoreOS, where we ship `/etc` as if `preset-all` were called. However, we want to allow users to disable default-enabled services via Ignition, which does this by creating preset dropins before switchroot. (For why we do `preset-all` at compose time, see: coreos/fedora-coreos-config#77). For example, the composed FCOS image has a `enable zincati.service` preset and an enablement for that in `/etc`, while at boot time when we switch root, there may be a `disable zincati.service` preset with higher precedence. In that case, we want systemd to disable the service. This is essentially a revert of 304b307. It seems like systemd *used* to do this, but it was changed to try to make the container workflow a bit faster. Resolves: coreos/fedora-coreos-tracker#392 Co-authored-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> (cherry picked from commit 9365158)
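To make the `zincati.service` example concrete, here is a minimal Python model of preset resolution and of the difference between enable-only and full preset. It is a sketch, not systemd's implementation. It assumes the documented rules from systemd.preset(5): a file in `/etc` hides a same-named file in `/run` or `/usr/lib`, the surviving files are read in lexicographic order by file name, the first matching directive wins, and a unit with no match is enabled. The preset file names below are hypothetical.

```python
from fnmatch import fnmatchcase

# Earlier directories hide same-named preset files in later ones.
PRESET_DIRS = [
    "/etc/systemd/system-preset",
    "/run/systemd/system-preset",
    "/usr/lib/systemd/system-preset",
]


def effective_preset_files(files):
    """files maps path -> [(action, pattern), ...]; return rule lists in read order."""
    chosen = {}
    for directory in PRESET_DIRS:
        for path, rules in files.items():
            parent, _, name = path.rpartition("/")
            if parent == directory and name not in chosen:
                chosen[name] = rules
    return [chosen[name] for name in sorted(chosen)]


def preset_action(unit, files):
    """First matching directive wins; with no match the unit is enabled."""
    for rules in effective_preset_files(files):
        for action, pattern in rules:
            if fnmatchcase(unit, pattern):
                return action
    return "enable"


def apply_preset(unit, enabled, files, full):
    """Return the new set of enabled units; full=False models enable-only mode."""
    result = set(enabled)
    if preset_action(unit, files) == "enable":
        result.add(unit)
    elif full:
        result.discard(unit)  # only a full preset removes existing enablement
    return result


# Hypothetical file names: the image ships an enable rule, a dropin written
# before switchroot carries a disable rule that sorts earlier.
FILES = {
    "/usr/lib/systemd/system-preset/80-example.preset": [("enable", "zincati.service")],
    "/etc/systemd/system-preset/20-example.preset": [("disable", "zincati.service")],
}

if __name__ == "__main__":
    shipped = {"zincati.service"}  # enablement link already present in /etc
    print(apply_preset("zincati.service", shipped, FILES, full=False))
    print(apply_preset("zincati.service", shipped, FILES, full=True))
```

In enable-only mode the shipped link survives even though the winning directive is `disable`, which is the gap the commit message describes. With a full preset the link is removed.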
This is a followup to #73.
The problem we hit with RHCOS is that it becomes a dangerous upgrade hazard. Since we previously shipped with the unit files default enabled in `/etc` (i.e. in ostree, the defaults are in `/usr/etc`), the problem comes from the combination with the fact that during `ConditionFirstBoot` we'll do `systemctl preset-all`, which will write those files, but they already existed.
Then if we try to upgrade to a tree without them, ostree will notice the files were deleted in the new defaults and apparently not modified... and then no `NetworkManager.service` anymore.
Really we want to have these defaults in `/usr`. We want e.g. `NetworkManager.service` to be defined as defaulted to on by us, not in `/etc`. I couldn't see a way to do this directly via the systemd CLI tools but the two-line shell implementation is pretty trivial, just need to do a `mkdir` pass, then move the links.
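As an illustration of the "`mkdir` pass, then move the links" idea, here is a Python sketch that runs against a scratch root instead of a real tree. It is not the shell code from the patch. The destination `usr/lib/systemd/system` is an assumption (the description only says `/usr`), and `move_enablement_links` and `demo` are hypothetical names.

```python
import os
import tempfile
from pathlib import Path


def move_enablement_links(root: Path) -> list[str]:
    """Move enablement symlinks from etc/systemd/system to usr/lib/systemd/system."""
    src = root / "etc/systemd/system"
    dst = root / "usr/lib/systemd/system"  # assumed destination under /usr
    moved = []
    # Enablement links live in *.wants and *.requires directories.
    for link_dir in sorted(src.glob("*.wants")) + sorted(src.glob("*.requires")):
        target_dir = dst / link_dir.name
        target_dir.mkdir(parents=True, exist_ok=True)  # the "mkdir pass"
        for link in sorted(link_dir.iterdir()):
            if link.is_symlink():
                # rename(2) moves the symlink itself, not the file it points to
                os.replace(link, target_dir / link.name)
                moved.append(f"{link_dir.name}/{link.name}")
    return moved


def demo() -> list[str]:
    """Build a tiny fake tree with one enablement link and relocate it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        wants = root / "etc/systemd/system/multi-user.target.wants"
        wants.mkdir(parents=True)
        (wants / "NetworkManager.service").symlink_to(
            "/usr/lib/systemd/system/NetworkManager.service"
        )
        return move_enablement_links(root)


if __name__ == "__main__":
    print(demo())
```

The sketch leaves alias links (symlinks placed directly in `etc/systemd/system`) untouched, and it assumes the links have absolute targets so they stay valid after the move.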
I really wish rpm-ostree had done this from the very start, it seems obvious in retrospect.
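The upgrade hazard described above can be modelled with a toy version of the `/etc` merge. This is a simplified sketch of the idea, not ostree's code: files are plain dictionary entries, `merge_etc` is a hypothetical name, and the only rule modelled is that local additions, modifications, and deletions relative to the old defaults are carried onto the new defaults.

```python
def merge_etc(old_defaults, new_defaults, current_etc):
    """Carry local changes (relative to old_defaults) over onto new_defaults."""
    merged = dict(new_defaults)
    for path, content in current_etc.items():
        if old_defaults.get(path) != content:
            merged[path] = content  # locally added or modified: keep it
    for path in old_defaults:
        if path not in current_etc:
            merged.pop(path, None)  # locally deleted: keep it deleted
    return merged


LINK = "multi-user.target.wants/NetworkManager.service"
TARGET = "/usr/lib/systemd/system/NetworkManager.service"

old_defaults = {LINK: TARGET, "hostname": "localhost"}
# preset-all on first boot rewrote an identical link, so it counts as unmodified.
current_etc = {LINK: TARGET, "hostname": "node1"}
# The new tree no longer ships the link in its /etc defaults.
new_defaults = {"hostname": "localhost"}

if __name__ == "__main__":
    print(merge_etc(old_defaults, new_defaults, current_etc))
```

The merged result keeps the locally changed `hostname` but drops the NetworkManager link, because nothing marks the link as a local modification.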