🆕⌚️ Automatic updates #247

Closed
cgwalters opened this issue Mar 23, 2016 · 46 comments

Comments

@cgwalters
Member

cgwalters commented Mar 23, 2016

EDIT 20181206:

Today with rpm-ostree if you want to enable automatic background updates, edit /etc/rpm-ostreed.conf, and ensure that the Daemon section looks like:

[Daemon]
AutomaticUpdatePolicy=stage
#IdleExitTimeout=60

Next, run systemctl enable rpm-ostreed-automatic.timer.

This won't automatically reboot though.
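As a rough illustration of the config edit described above, here is a minimal Python sketch that rewrites the text of an rpm-ostreed.conf-style file so that the [Daemon] section carries the desired AutomaticUpdatePolicy. The function name set_update_policy is made up for this example, it works on a string rather than touching /etc, and it is not how rpm-ostree itself handles the file.

```python
import re

# Matches the key whether or not it is commented out with a leading '#'.
KEY_RE = re.compile(r"^\s*#?\s*AutomaticUpdatePolicy\s*=")


def set_update_policy(conf_text: str, policy: str) -> str:
    """Return conf_text with AutomaticUpdatePolicy=<policy> in [Daemon].

    Hypothetical helper: other lines (including comments) are kept as-is.
    """
    out, in_daemon, done = [], False, False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Leaving [Daemon] without having seen the key: add it first.
            if in_daemon and not done:
                out.append(f"AutomaticUpdatePolicy={policy}")
                done = True
            in_daemon = stripped == "[Daemon]"
        elif in_daemon and not done and KEY_RE.match(line):
            line = f"AutomaticUpdatePolicy={policy}"
            done = True
        out.append(line)
    if not done:
        # Key never seen: append it, creating the section if needed.
        if not in_daemon:
            out.append("[Daemon]")
        out.append(f"AutomaticUpdatePolicy={policy}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "[Daemon]\n#AutomaticUpdatePolicy=none\n#IdleExitTimeout=60\n"
    print(set_update_policy(sample, "stage"), end="")
```

Working on text keeps the sketch side-effect free; a real tool would still need to write the file back as root and have the daemon reload its configuration.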

This thread though contains a lot of background information/design around higher level issues.


Initial PR: #1147

@cgwalters
Member Author

If we do have hands-off upgrades that's going to drive a more immediate need for automated rollbacks. That's #177

@cgwalters
Member Author

I am now thinking the default model for automatic updates should involve automatic downloading/queuing. Having to download just the rpmdb to display diffs sucks for multiple reasons. Among them it's going to be hard to support if we move to OCI images. Plus I'd like to support a "deltas only" ostree repo mode. Or a combination.

Beyond that, for the majority of cases such as standalone desktop, enterprise desktop, enterprise server this is what I think is a good default. Enterprise particularly if we encourage local mirroring. One case where people may not want this is standalone embedded systems, but we can obviously support the status quo of typing rpm-ostree upgrade. This is more about defaults and UI workflow of the tools.

@cgwalters
Member Author

So specifically for Cockpit, I'd like to move them away from the GetCached* DBus API towards a UI that's oriented around controlling automatic updates.

@dustymabe
Member

Having to download just the rpmdb to display diffs sucks

So.. I have an idea for this (probably not a very good one). See #558 where in the 2nd paragraph I say

I think we can achieve this goal if we add, in a predictable format, the list of rpms in that commit to the commit log message.

That way we don't need the rpmdb to do a diff. Also we can choose to only use the rpm data from the commit message if the rpmdb doesn't exist locally, i.e. we only have metadata about the commit. WDYT?

@cgwalters
Member Author

Yeah, I think we can put at least the NEVRAs in the commit header.

# rpm -qa|xz | wc -c
4180

which isn't too bad.
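The same measurement can be sketched in Python without needing an rpmdb: join a list of NEVRA strings, xz-compress it, and compare sizes. The package names below are invented placeholders, so the numbers it prints say nothing about a real system; the point is only the shape of the calculation behind rpm -qa | xz | wc -c.

```python
import lzma


def compressed_size(nevras):
    """Return (raw_bytes, xz_bytes) for a newline-separated NEVRA list."""
    payload = ("\n".join(nevras) + "\n").encode("utf-8")
    packed = lzma.compress(payload, format=lzma.FORMAT_XZ)
    return len(payload), len(packed)


if __name__ == "__main__":
    # Placeholder data, NOT a real package set.
    fake_nevras = [f"example-pkg{i}-1.{i}-1.fc27.x86_64" for i in range(400)]
    raw, packed = compressed_size(fake_nevras)
    print(f"raw={raw} bytes, xz={packed} bytes")
```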

@cgwalters
Member Author

I think this also blocks on ostreedev/ostree#545

@jlebon
Member

jlebon commented Dec 5, 2017

Had a chat with @cgwalters about this today. Here are the notes from that:

High-level expectations:

  1. rpm-ostree status should indicate:

    • if auto-update is completely off, then a line to that effect
    • if no updates are present, then the last time updates were successfully checked for; this is important to ensure users are aware of any e.g. timer/networking etc... issues that may give them a false sense of security
    • if an update is present, then what the pending version/csum and pkgs are, and importantly whether there are any security updates. Be able to provide a diff with e.g. -v (bottom has mock-up outputs).
  2. Users can choose between different levels of automation. Possible levels to consider:
    a) [none] (current)
    b) [check] (download the minimal amount of ostree/rpmmd metadata to know that there is an update and describe it)
    - This would be a good default to ship with
    walters: Two check phases: Check just md freshness, versus download full md? Or maybe too hard.
    c) [download] (download the full ostree/new packages)
    d) [deploy] (deploy but don't reboot)
    - This of course would be blocked on:
      - Have upgrade only do /etc merge just before reboot #40
      - Support commit, syncfs, /etc merge just before reboot ostreedev/ostree#545
    e) [reboot] (deploy and reboot)
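One way to read the levels above is as an ordered scale where each level includes the work of the ones below it. The sketch below models that reading with an IntEnum; the cumulative interpretation and the names Level and steps_for are assumptions made for illustration, not something the notes spell out.

```python
from enum import IntEnum


class Level(IntEnum):
    """Automation levels from the notes, ordered from least to most hands-off."""
    NONE = 0
    CHECK = 1
    DOWNLOAD = 2
    DEPLOY = 3
    REBOOT = 4


def steps_for(level: Level):
    """List the phases a given level would run, assuming levels are cumulative."""
    return [lv.name.lower() for lv in Level if Level.NONE < lv <= level]


if __name__ == "__main__":
    for lv in Level:
        print(lv.name.lower(), "->", steps_for(lv))
```

An ordered type makes checks such as "is the configured level at least download?" a simple comparison.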

I feel like between all of these steps, at least for the desktop we need to think about having gnome-software be in control of triggers. Similarly for server side, Ansible control for blue/green.

Implementation:

  1. include rpmdb pkglist in commit metadata during compose
    • for jigdo, should we split the jigdo RPM into a thinner commit metadata only one and a fatter content one? this could also help with gpg signature verification
    • or just making the jigdo RPM just Requires all the packages and fetch that pkglist from rpmmd
  2. leave package_diff and cached* API business separate for now; they need to always work for Cockpit even on commits without the new rpmdb pkglist and they download /usr/share/rpm -- we can look to unify this with the deploy_transaction_execute flow afterwards so that it uses the new pkglist if available, otherwise falls back?
  3. teach the deploy transaction the needed logic to support the [check] mode (i.e. turn on commit metadata only, refresh rpmmd, heuristically try to find updates to layered pkgs). [download] is already supported by --download-only.
  4. enhance the CachedUpdate property in a backcompatible manner to also include rpm diff and make deploy transaction update that during non-deploy mode.
  5. teach status to read CachedUpdate property and display the relevant info
  6. ship systemd timer & service that calls upgrade with a hidden --auto=$MODE switch with MODE coming from e.g. /etc/rpm-ostree-automatic.conf.

Other considerations:

  • where to keep:
    1. last update check
      • bump timestamp on a file in /var/cache?
    2. auto-update policy setting
      • /etc/rpm-ostree-automatic.conf?
  • how should the systemd timer & auto-update mode be managed? purely by systemctl and e.g. vi /etc/rpm-ostree-automatic.conf, or should rpm-ostree provide a wrapper for it? leaning more towards the former.

Mock-up status outputs:

$ rpm-ostree status
State: idle, automatic updates enabled (download)
Deployments:
● atomicws:fedora/x86_64/workstation
                   Version: 26.230 (2017-10-15 03:11:00)
                BaseCommit: b8503c69c36591606c11743abdfeb5591c1ae8d9c3c69c18a583071b3b7caf3f
           LayeredPackages: krb5-workstation libvirt-client mosh sshpass strace tmux

Pending update:
            Version: 26.241 (2017-11-28 12:09:24)
             Commit: abcdef12344591606c11743abdfeb5591c1ae8d9c3c69c18a583071b3b7caf3f
               Diff: 12 upgrades, 2 downgrades, 2 removals, 1 addition

$ rpm-ostree status --verbose
State: idle, automatic updates enabled (download)
Deployments:
● atomicws:fedora/x86_64/workstation
                   Version: 26.230 (2017-10-15 03:11:00)
                BaseCommit: b8503c69c36591606c11743abdfeb5591c1ae8d9c3c69c18a583071b3b7caf3f
           LayeredPackages: krb5-workstation libvirt-client mosh sshpass strace tmux

Pending update:
            Version: 26.241 (2017-11-28 12:09:24)
             Commit: abcdef12344591606c11743abdfeb5591c1ae8d9c3c69c18a583071b3b7caf3f
           Upgraded: 12 packages
                     |- asdf 1.23.213 -> 5.12.23
                     ...
                     `- rtyu 2.4 -> 12.3
                     (includes both tree updates and layering updates)
         Downgraded: 2 packages
                     |- zxcv-2.1.23
                     ...
                     (just includes tree updates)
            Removed: 2 packages
                     |- zxcv-2.1.23
                     ...
                     (just includes tree updates)
              Added: 1 packages
                     |- zxcv-2.1.23
                     ...
                     (just includes tree updates)

# when there are security updates:

$ rpm-ostree status
...
Pending update:
            Version: 26.241 (2017-11-28 12:09:24)
             Commit: abcdef12344591606c11743abdfeb5591c1ae8d9c3c69c18a583071b3b7caf3f
    SecurityUpdates: 2 packages (kernel, openssh-clients)    [[BOLDED RED]]
               Diff: 2 upgrades

$ rpm-ostree status --verbose
...
Pending update:
            Version: 26.241 (2017-11-28 12:09:24)
             Commit: abcdef12344591606c11743abdfeb5591c1ae8d9c3c69c18a583071b3b7caf3f
    SecurityUpdates: 2 packages
                     |- kernel
                     |  |- <list of available references & URLs>
                     |  `- ...
                     `- openssh-clients
                        |- <list of available references & URLs>
                        `- ...
           Upgraded: 2 packages
                     |- kernel 1.2.3 -> 4.5.6
                     `- openssh-clients 1.2 -> 3.4
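The "Diff:" summary in these mock-ups boils down to comparing two name-to-version maps. Below is a small sketch of that classification. The version comparison is deliberately naive (numeric-aware splitting on dots and dashes) and is not rpm's real version comparison; diff_packages and version_key are hypothetical names, and the package data is borrowed from the mock-up output.

```python
import re


def version_key(version: str):
    """Naive sort key: numeric parts compare as numbers. NOT rpmvercmp."""
    return [(0, int(p), "") if p.isdigit() else (1, 0, p)
            for p in re.split(r"[.\-]", version)]


def diff_packages(old: dict, new: dict) -> dict:
    """Classify package changes between two {name: version} maps."""
    result = {"upgraded": [], "downgraded": [], "removed": [], "added": []}
    for name in sorted(old.keys() | new.keys()):
        if name not in new:
            result["removed"].append(name)
        elif name not in old:
            result["added"].append(name)
        elif old[name] != new[name]:
            newer = version_key(new[name]) > version_key(old[name])
            kind = "upgraded" if newer else "downgraded"
            result[kind].append((name, old[name], new[name]))
    return result


if __name__ == "__main__":
    old = {"kernel": "1.2.3", "openssh-clients": "1.2", "zxcv": "2.1.23"}
    new = {"kernel": "4.5.6", "openssh-clients": "3.4", "rtyu": "12.3"}
    for kind, items in diff_packages(old, new).items():
        print(f"{kind}: {len(items)}")
```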

@jlebon jlebon changed the title integrate optional systemd timer for individual host automatic upgrades 🆕⌚️ Automatic updates Dec 5, 2017
@jlebon jlebon added the jira for syncing to jira label Dec 5, 2017
@dustymabe
Member

In that status output I think Available Update vs Pending Update would probably be more appropriate, especially if we haven't staged a deployment. Then we should probably list the state of the update: not downloaded, downloaded, deployed and staged for next reboot. We can come up with more succinct words to describe those states.

@jlebon
Member

jlebon commented Dec 7, 2017

Definitely, we need to describe the state as well. Another interesting piece of information that would be worth displaying is the size of the download. Interestingly, this is something we can easily calculate for jigdo remotes. In the ostree remote case, we can only display that if there are static deltas.

@dustymabe
Member

OK. One other thing I wonder whether we're covering: automatic rollbacks based on some conditions. If we enable automatic updates including the reboot then we should at least think about automatic rollbacks in case of some sort of failure. We can only do so much here, since the mechanism that triggers the rollback would depend on the system coming up at least somewhat, but it is something I'd love to see us brainstorm.

@jlebon
Member

jlebon commented Dec 14, 2017

Right, this is #177. I'm open to discuss whether to hide the reboot mode until that's supported. At the very least, we'd need a warning of some sort to make that clear. OTOH, I don't want to completely not support reboot because of that either. E.g. I don't mind taking on the risk for my pet home servers. :)

@dustymabe
Member

Right, this is #177.

cool

@jlebon
Member

jlebon commented Jan 5, 2018

WIP in #1147.

@cgwalters
Member Author

OK so let's try to agree on what happens with the "first cut" of this. Are we thinking that we'll land this but it will just be disabled by default and people who want it can opt-in for now?

I'm generally OK with that. But there are definitely issues in turning on even check by default. A good example of a past conversation is around including fedora-motd in Atomic Host.

Now a good thing here is we're not triggering the updates out of PAM. But we still have the problem for example that a whole lot of people need to configure a proxy.

What I'd like to see for example is adding the notion of "auto-cancellable transactions" or so. Basically if while the rpm-ostree upgrade --automatic timer is running, I do rpm-ostree override remove or whatever, I don't want to get an error and have to rpm-ostree cancel.
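To make the "auto-cancellable transactions" idea concrete, here is a toy model of the behaviour being asked for: a user-initiated transaction preempts an automatic one instead of failing, while two non-automatic transactions still conflict. Everything here (Transaction, TransactionSlot, begin) is hypothetical and only illustrates the proposed semantics; it does not describe how rpm-ostreed is implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    name: str
    automatic: bool = False
    cancelled: bool = False


class TransactionSlot:
    """Hypothetical model: at most one transaction is active at a time."""

    def __init__(self) -> None:
        self.active: Optional[Transaction] = None

    def begin(self, name: str, automatic: bool = False) -> Transaction:
        current = self.active
        if current is not None:
            if current.automatic and not automatic:
                # A manual request silently cancels the background one.
                current.cancelled = True
            else:
                raise RuntimeError(f"transaction in progress: {current.name}")
        self.active = Transaction(name, automatic)
        return self.active


if __name__ == "__main__":
    slot = TransactionSlot()
    auto = slot.begin("upgrade --automatic", automatic=True)
    manual = slot.begin("override remove")
    print("automatic cancelled:", auto.cancelled)
    print("active:", manual.name)
```

In this model a second automatic run does not preempt anything; it gets the same "in progress" error a manual command gets today.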

Further, a whole big conceptual issue is the degree to which our systemd units are "special". We also need to support e.g. gnome-software, Cockpit, and also Ansible at least; @jlebon mentioned that in

I feel like between all of these steps, at least for the desktop we need to think about having gnome-software be in control of triggers. Similarly for server side, Ansible control for blue/green.

I think in the "personal desktop" case it's pretty clear gnome-software could just frob the settings in the config file (do we own the polkit gateway for that? expose an API?)

BTW down the line for the "CSB laptop" case I'd actually like to support a mode where if e.g. someone has their laptop suspended/turned off for a month while they go on vacation, when they boot up Internet access is disabled for everything except rpm-ostree upgrades until they get updated. I'm sure some people would despise this idea but if we make updates fast and painless we can get a lot closer to having both security and convenience.

@cgwalters
Member Author

(Actually for the desktop case implementing that is probably a gnome-software thing given flatpaks need updating too)

@kalev
Contributor

kalev commented Jan 19, 2018

Instead of a config file, I think it may be easier to have gnome-software drive the automatic updates over dbus -- it already has a session service specifically for that purpose. This way it could also make sure that base OS and flatpak updates are applied at the same time, reducing user interruptions etc. Would that make sense?

@jlebon
Member

jlebon commented Jan 19, 2018

That makes sense and is part of the design in #1147. Basically, gnome-software could just turn off the timer and call AutomaticUpdateTrigger() at its leisure. We don't support a deploy mode right now since it wouldn't make sense from a timer without fixing #40 first. But in an "update & reboot" model, #40 is less relevant, and we can add support for that. (Of course, that can be done today as well with the code in #1147 by just using a follow-up UpdateDeployment() in cache-only mode.)

@kalev
Contributor

kalev commented Jan 19, 2018

That sounds great! Let me see if I can quickly hack up gnome-software to make use of the new goodness and then report back on Monday or so.

@dustymabe
Member

@kalev, is there any sort of gnome-software CLI? gnome-software incorporates rpms, flatpaks, firmware, ostree?? It would be really nice to have something like that on my Atomic Host (not workstation) system in CLI form to report potential updates and allow me to choose what to install. Related discussion in #405 (comment)

@jlebon
Member

jlebon commented Jan 19, 2018

Now a good thing here is we're not triggering the updates out of PAM. But we still have the problem for example that a whole lot of people need to configure a proxy.

True. But that's something they'd have to set up to upgrade manually as well, right?

What I'd like to see for example is adding the notion of "auto-cancellable transactions" or so. Basically if while the rpm-ostree upgrade --automatic timer is running, I do rpm-ostree override remove or whatever, I don't want to get an error and have to rpm-ostree cancel.

That's a really interesting idea. Let's split that out into its own issue once #1147 is merged?

Further, a whole big conceptual issue is the degree to which our systemd units are "special". We also need to support e.g. gnome-software, Cockpit, and also Ansible at least

It depends what you mean by "support Ansible" here but yeah, configuring automatic updates can be done solely through editing /etc/rpm-ostreed.conf, and enabling the timer e.g. with the service module.

Cockpit could do something similar to the rpm-ostree client, querying the AutomaticUpdatePolicy D-Bus property and checking the systemd service to know when it last ran?

Basically, since we're just using systemd units and a config file, we're in pretty familiar territory for any application that wants to control this stuff. There is no stored state elsewhere that's exclusively managed by rpm-ostree. One example of this is that rpm-ostree status just queries systemd for information. Another one is that #1147 doesn't automatically enable the timer on startup if the AutomaticUpdatePolicy config is not set to off. E.g. an ansible playbook can just do service: enabled=no name=rpm-ostreed-automatic.timer to make sure automatic updates are off regardless of what's in the config file.
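Since the state lives in systemd, an external tool could learn when the timer last fired by reading its properties. The sketch below only parses KEY=VALUE text of the kind systemctl show prints; the choice of the LastTriggerUSec timer property, the treatment of "n/a" as "never", and the sample strings are assumptions for illustration, and this is not necessarily what rpm-ostree status does internally.

```python
def parse_show(output: str) -> dict:
    """Parse KEY=VALUE lines (as printed by `systemctl show`) into a dict."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def last_trigger(props: dict):
    """Return the last trigger time string, or None if the timer never fired.

    Assumption: an unset timestamp shows up as an empty value or "n/a".
    """
    value = props.get("LastTriggerUSec", "")
    return None if value in ("", "n/a") else value


if __name__ == "__main__":
    # Illustrative text; on a real host it could come from something like:
    #   systemctl show rpm-ostreed-automatic.timer --property=LastTriggerUSec
    sample = "LastTriggerUSec=n/a\n"
    print("last run:", last_trigger(parse_show(sample)) or "unknown")
```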

@peterbaouoft
Contributor

peterbaouoft commented Jan 30, 2018

Hi, I had a try for the autoupdate. It works nicely for me =). I do have a few questions though (hopefully you won't mind =) ). Note: the test output might be long (but content should not be that much). I also did not read many of the comments above, so if I happen to miss something, please let me know =P

1: When applying the auto-update patch, rpm-ostree status takes noticeably longer than before. Is that expected?

[root@localhost ~]# time rpm-ostree status    
State: idle; auto updates disabled
Deployments:
* ostree://fedora-atomic:fedora/27/x86_64/atomic-host
                   Version: 27.60 (2018-01-16 16:35:15)
                BaseCommit: 972e5a8158b610fec80f3f73f3372b7bea2b841038f2e246aa7623dbf5b5a751
              GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
           LayeredPackages: man
                  Unlocked: development

  ostree://fedora-atomic:fedora/27/x86_64/atomic-host
                   Version: 27.61 (2018-01-17 15:52:47)
                BaseCommit: 772ab185b0752b0d6bc8b2096d08955660d80ed95579e13e136e6a54e3559ca9
              GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
           LayeredPackages: man

real	0m0.040s
user	0m0.023s
sys	0m0.005s

vs

[root@localhost ~]# time rpm-ostree status -v 
State: idle; auto updates enabled (check; last run unknown)
Deployments:
* ostree://fedora-atomic:fedora/27/x86_64/atomic-host
                   Version: 27.60 (2018-01-16 16:35:15)
                BaseCommit: 972e5a8158b610fec80f3f73f3372b7bea2b841038f2e246aa7623dbf5b5a751
                    Commit: ab75f9249820bd6c32e16ebbf9947322b484aaa9d4164cf573bc7480a1c2a22b
                 StateRoot: fedora-atomic
              GPGSignature: 1 signature
                            Signature made Tue Jan 16 16:35:22 2018 using RSA key ID F55E7430F5282EE4
                            Good signature from "Fedora 27 <fedora-27@fedoraproject.org>"
           LayeredPackages: man
                  Unlocked: development

  ostree://fedora-atomic:fedora/27/x86_64/atomic-host
                   Version: 27.61 (2018-01-17 15:52:47)
                BaseCommit: 772ab185b0752b0d6bc8b2096d08955660d80ed95579e13e136e6a54e3559ca9
                    Commit: bfb5f4147f4b9aa6d5b0277ec337ee38871cedbcc2e97721609f242f15d3b37c
                 StateRoot: fedora-atomic
              GPGSignature: 1 signature
                            Signature made Wed Jan 17 15:52:59 2018 using RSA key ID F55E7430F5282EE4
                            Good signature from "Fedora 27 <fedora-27@fedoraproject.org>"
           LayeredPackages: man

Available update:
       Version: 27.61 (2018-01-17 15:52:47)
        Commit: 772ab185b0752b0d6bc8b2096d08955660d80ed95579e13e136e6a54e3559ca9
  GPGSignature: 1 signature
                Signature made Wed Jan 17 15:52:59 2018 using RSA key ID F55E7430F5282EE4
                Good signature from "Fedora 27 <fedora-27@fedoraproject.org>"
      Upgraded: docker 2:1.13.1-42.git4402c09.fc27 -> 2:1.13.1-44.git584d391.fc27
                docker-common 2:1.13.1-42.git4402c09.fc27 -> 2:1.13.1-44.git584d391.fc27
                docker-rhel-push-plugin 2:1.13.1-42.git4402c09.fc27 -> 2:1.13.1-44.git584d391.fc27

real	0m25.050s
user	0m0.022s
sys	0m0.010s

2: It seems like I have to do an upgrade --preview in order to make rpm-ostree status show the available update, is that the expected behavior?

[root@localhost ~]# rpm-ostree status
State: idle; auto updates disabled
Deployments:
* ostree://fedora-atomic:fedora/27/x86_64/atomic-host
                   Version: 27.60 (2018-01-16 16:35:15)
                BaseCommit: 972e5a8158b610fec80f3f73f3372b7bea2b841038f2e246aa7623dbf5b5a751
              GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
           LayeredPackages: man
                  Unlocked: development

  ostree://fedora-atomic:fedora/27/x86_64/atomic-host
                   Version: 27.61 (2018-01-17 15:52:47)
                BaseCommit: 772ab185b0752b0d6bc8b2096d08955660d80ed95579e13e136e6a54e3559ca9
              GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
           LayeredPackages: man
[root@localhost ~]# vi /etc/rpm-ostreed.conf 
[root@localhost ~]# cat /etc/rpm-ostreed.conf 
# Entries in this file show the compile time defaults.
# You can change settings by editing this file.
# For option meanings, see rpm-ostreed.conf(5).

[Daemon]
AutomaticUpdatePolicy=check
#IdleExitTimeout=60
[root@localhost ~]# rpm-ostree reload
[root@localhost ~]# time rpm-ostree status
State: idle; auto updates enabled (check; last run unknown)
Deployments:
* ostree://fedora-atomic:fedora/27/x86_64/atomic-host
                   Version: 27.60 (2018-01-16 16:35:15)
                BaseCommit: 972e5a8158b610fec80f3f73f3372b7bea2b841038f2e246aa7623dbf5b5a751
              GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
           LayeredPackages: man
                  Unlocked: development

  ostree://fedora-atomic:fedora/27/x86_64/atomic-host
                   Version: 27.61 (2018-01-17 15:52:47)
                BaseCommit: 772ab185b0752b0d6bc8b2096d08955660d80ed95579e13e136e6a54e3559ca9
              GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
           LayeredPackages: man

real	0m25.064s
user	0m0.024s
sys	0m0.006s
[root@localhost ~]# rpm-ostree upgrade --preview
1 metadata, 0 content objects fetched; 569 B transferred in 0 seconds
Enabled rpm-md repositories: updates fedora

Updating metadata for 'updates': [=============] 100%
rpm-md repo 'updates'; generated: 2018-01-29 17:58:29

Updating metadata for 'fedora': [=============] 100%
rpm-md repo 'fedora'; generated: 2017-11-05 05:51:47

Importing metadata [=============] 100%
Available update:
       Version: 27.61 (2018-01-17 15:52:47)
        Commit: 772ab185b0752b0d6bc8b2096d08955660d80ed95579e13e136e6a54e3559ca9
  GPGSignature: 1 signature
                Signature made Wed Jan 17 15:52:59 2018 using RSA key ID F55E7430F5282EE4
                Good signature from "Fedora 27 <fedora-27@fedoraproject.org>"
      Upgraded: docker 2:1.13.1-42.git4402c09.fc27 -> 2:1.13.1-44.git584d391.fc27
                docker-common 2:1.13.1-42.git4402c09.fc27 -> 2:1.13.1-44.git584d391.fc27
                docker-rhel-push-plugin 2:1.13.1-42.git4402c09.fc27 -> 2:1.13.1-44.git584d391.fc27

[root@localhost ~]# time rpm-ostree status   
State: idle; auto updates enabled (check; last run unknown)
Deployments:
* ostree://fedora-atomic:fedora/27/x86_64/atomic-host
                   Version: 27.60 (2018-01-16 16:35:15)
                BaseCommit: 972e5a8158b610fec80f3f73f3372b7bea2b841038f2e246aa7623dbf5b5a751
              GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
           LayeredPackages: man
                  Unlocked: development

  ostree://fedora-atomic:fedora/27/x86_64/atomic-host
                   Version: 27.61 (2018-01-17 15:52:47)
                BaseCommit: 772ab185b0752b0d6bc8b2096d08955660d80ed95579e13e136e6a54e3559ca9
              GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
           LayeredPackages: man

Available update:
       Version: 27.61 (2018-01-17 15:52:47)
        Commit: 772ab185b0752b0d6bc8b2096d08955660d80ed95579e13e136e6a54e3559ca9
  GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
          Diff: 3 upgraded

real	0m25.069s
user	0m0.022s
sys	0m0.010s

3: Last question, how do I generate a test output so that last run is no longer unknown in the status?

Other than that, the functionality looks nice =). Sorry it took long, had to spend time understanding the testing procedure. And this is the complete test log if you are interested:
https://paste.fedoraproject.org/paste/F~Nxr4I7w3j3QSctno~jbQ ( also seems long, read with caution)

@jlebon
Member

jlebon commented Jan 31, 2018

Thanks @peterbaouoft for trying it out! :)

1: When applying the auto-update patch, rpm-ostree status takes noticeably longer than before. Is that expected?

Ahh, you're probably hitting fedora-selinux/selinux-policy-contrib#45. You can either use the same hack we use in the testsuite, or just setenforce 0.

2: It seems like I have to do an upgrade --preview in order to make rpm-ostree status show the available update, is that the expected behavior?

Right. The reload only reloads the configuration. The actual check for updates happens according to the rpm-ostreed-automatic.timer. You can also do rpm-ostree upgrade --trigger-automatic-update-policy to force a check.

3: Last question, how do I generate a test output so that last run is no longer unknown in the status?

That's due to the SELinux policy issue above.

@peterbaouoft
Contributor

Ahh, you're probably hitting fedora-selinux/selinux-policy-contrib#45. You can either use the same hack we use in the testsuite, or just setenforce 0.

Yup, applying setenforce 0 does make it a lot faster, and it also seems to solve the unknown status problem. Two birds with one stone! =P

[root@localhost ~]# time rpm-ostree status
State: idle; auto updates enabled (check; no runs since boot)
Deployments:
* ostree://fedora-atomic:fedora/27/x86_64/atomic-host
                   Version: 27.60 (2018-01-16 16:35:15)
                BaseCommit: 972e5a8158b610fec80f3f73f3372b7bea2b841038f2e246aa7623dbf5b5a751
              GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
           LayeredPackages: man
                  Unlocked: development

  ostree://fedora-atomic:fedora/27/x86_64/atomic-host
                   Version: 27.61 (2018-01-17 15:52:47)
                BaseCommit: 772ab185b0752b0d6bc8b2096d08955660d80ed95579e13e136e6a54e3559ca9
              GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
           LayeredPackages: man

Available update:
       Version: 27.61 (2018-01-17 15:52:47)
        Commit: 772ab185b0752b0d6bc8b2096d08955660d80ed95579e13e136e6a54e3559ca9
  GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
          Diff: 3 upgraded

real	0m0.047s
user	0m0.024s
sys	0m0.008s

The actual check for updates happens according to the rpm-ostreed-automatic.timer. You can also do rpm-ostree upgrade --trigger-automatic-update-policy to force a check.

I see, makes sense. Thanks for the explanation! I am more and more excited about this new feature now(auto-update)! =D

@jlebon
Member

jlebon commented Feb 8, 2018

The counter here though is that some use cases want to show e.g. how much will be downloaded before actually doing it. And particularly with layering involved we can't do that until we depsolve. Which...hm maybe is your check phase. So I guess we do need that.

Yeah, I think there's a lot of use cases where you don't want your updater to auto-download in the background. E.g. for FAW, I'd feel comfortable shipping with check by default, but not download/prepare. The depsolve issue is indeed unfortunate but not terrible. I think in the great majority of cases, our heuristics will work. Perfect is the enemy of good. :)

I'm keeping this whole comment here since it might be useful, hopefully?

I think it helps to reason out things explicitly to make sure we're going the right way!

There's a pretty huge semantic difference between reboot and anything not-reboot.

To get back to this, I do see where you're coming from. I think in that case, I'd rather we not ship such a timer at all for now?

My initial thoughts before were to add some of these "policy engine"-like settings to rpm-ostree itself, such as "auto reboot only for security erratas", or "auto reboot, but not for layered packages". I still think there's some value in doing this for the lone server/IoT case, because even though it's not very hard to implement manually, it makes things really easy to configure OOTB. But I guess that should be a separate discussion from whether to have a dumb reboot policy at all. So I'd vote for leaving this out for now until we gain more experience in the managed workflows like GNOME Software and cluster cases.

@ashcrow
Member

ashcrow commented Apr 10, 2018

I tend to think anything that will reboot the node needs to be handled outside of the daemon directly. The daemon itself (unless I'm mistaken) isn't aware of how many other nodes it lives with and can't initiate a restart without that possibility of downtime. Instead, having rpm-ostree be in a state noting that it's ready to apply its update (or that updates are available) seems ideal. Then the external controller can make intelligent decisions based on state.

Edit: s/agent/daemon/g

@cgwalters
Member Author

Yeah...it's tempting to take the reboot mode out of rpm-ostreed entirely but I actually am still today using the timer linked at the top that just does rpm-ostree upgrade -r on my home server, just accepting the downtime. I should probably switch now to the reboot policy but eh.

This all ties back into the (just posted) https://pagure.io/atomic-wg/issue/453

@ashcrow
Member

ashcrow commented Apr 10, 2018

I think having the -r is fine. The more I think about it, the policy idea sounds good as well ... but it should default to the least surprising setting. Part of an external management system's job would be to ensure that the policy is set to download, so it can reliably control when the deployment occurs.

@jlebon
Member

jlebon commented Apr 10, 2018

One question is whether rpm-ostreed-automatic initiates new deployment creation (as suggested in the WIP that proposes a new stage policy: #1321), or the agent. The former is clearly useful also for single node/workstation cases. Though in the cluster case, an argument for the latter is that the agent is a better place to embed policy engine style settings. E.g. I'm not sure we want to cause updates across the whole cluster if only a utility layered pkg was updated.

@ashcrow
Member

ashcrow commented Apr 10, 2018

@jlebon isn't rpm-ostreed-automatic still dependent on what policy is set? If so, we could document how one could set their non agent managed nodes by changing policy. If they are using the agent then the agent could verify/set the proper policy it expects. It would follow as:

  1. A single node/group of nodes: We default to downloading updates (or doing nothing)
  2. A single node/group of nodes with auto deploy on: We download and deploy with a reboot
  3. A single node/group of nodes managed by an agent: We download updates and defer to the agent to tell us when to deploy and reboot

This is a tricky subject though. My initial feeling is to put as much orchestration in the agent and as little in rpm-ostree. What keeps me from outright pushing for that is any agent that is used will likely be tied to a specific orchestration system or tool. If we try to make a generic agent then we are basically providing an interface and, to me, that would seem more at home in rpm-ostree anyway.

@cgwalters
Member Author

This is a tricky subject though. My initial feeling is to put as much orchestration in the agent and as little in rpm-ostree. What keeps me from outright pushing for that is any agent that is used will likely be tied to a specific orchestration system or tool. If we try to make a generic agent then we are basically providing an interface and, to me, that would seem more at home in rpm-ostree anyway.

Yeah, that's the core tension. I guess my core feeling is let's not delete anything that exists in rpm-ostreed today, but I would vote that the Kube agent initiates updates itself rather than relying on the timer.

@jlebon
Member

jlebon commented Apr 11, 2018

So, from discussions here, I think what we want is "both". I.e. we do want a "stage" mode that rpm-ostreed knows about and that is enacted by the timer. E.g. that's something I'd love to have on my workstation. But we also want to be fully compatible with agents that want to take over all aspects of node management, including when stage deployments are created (and obviously when to reboot).

We could slice this further even into node agents that could still rely on rpm-ostree's "check" mode to know that a node has an update vs a more controlled environment where the "update available" signal comes directly to the agent OOB from some other metadata protocol (in which case, the rpm-ostree timer/policy is completely off).
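
To make the split between the modes concrete, here is a small Python sketch of what the built-in timer would do under each policy. The "check" and "stage" modes come from this discussion; the `none` value used for "policy completely off" and the action labels are assumptions made for illustration, not rpm-ostree's actual internals.

```python
from enum import Enum
from typing import List


class Policy(Enum):
    NONE = "none"    # assumed name for "timer/policy completely off"
    CHECK = "check"  # only learn that an update exists
    STAGE = "stage"  # download and stage a deployment


def timer_actions(policy: Policy) -> List[str]:
    """Return the (illustrative) steps the timer performs for a policy."""
    if policy is Policy.NONE:
        # An agent receiving the "update available" signal out of band
        # would run in this mode and drive everything itself.
        return []
    actions = ["check"]
    if policy is Policy.STAGE:
        actions += ["download", "stage"]
    # No policy in this sketch reboots: that decision stays with the
    # administrator or the agent.
    return actions
```

An agent that relies on "check" mode would read the result of the first step, while a fully controlled environment would run with the policy off.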

@jlebon
Member

jlebon commented Apr 11, 2018

> Yeah...it's tempting to take the reboot mode out of rpm-ostreed entirely but I actually am still today using the timer linked at the top that just does rpm-ostree upgrade -r on my home server, just accepting the downtime. I should probably switch now to the reboot policy but eh.

Note that reboot is not actually supported right now. Depending on how we want to implement https://pagure.io/atomic-wg/issue/453 re. the single node case, it might make sense to add it (and e.g. let that be the default we ship with in Fedora). Though maybe not if there's a bunch of "policy" type things we want to account for (e.g. "are any users logged in and how long have they been idle for?"). I think I'd rather have that logic live somewhere else.

@ashcrow
Member

ashcrow commented Apr 11, 2018

> So, from discussions here, I think what we want is "both". I.e. we do want a "stage" mode that rpm-ostreed knows about and enacted by the timer. E.g. that's something I'd love to have on my workstation.

To clarify, stage mode would download updates and have them ready for deployment (but not actually deploy them), correct?

> But we also want to be fully compatible with agents that want to take over all aspects of node management, including when stage deployments are created (and obviously when to reboot).

👍

> Note that reboot is not actually supported right now. Depending on how we want to implement https://pagure.io/atomic-wg/issue/453 re. the single node case, it might make sense to add it (and e.g. let that be the default we ship with in Fedora). Though maybe not if there's a bunch of "policy" type things we want to account for (e.g. "are any users logged in and how long have they been idle for?"). I think I'd rather have that logic live somewhere else.

That makes sense. Assuming staging means downloading and being ready to deploy, I say let's get that in. Having a configurable timer to deploy and reboot is fine too, as long as we can disable the timer's deploy portion. Being able to disable the auto staging would be nice to have but could be added at a later time.

@ashcrow
Member

ashcrow commented Apr 12, 2018

Do we have a path forward on this?

@cgwalters
Member Author

That makes sense and is part of the design in #1147. Basically, gnome-software could just turn off the timer and call AutomaticUpdateTrigger() at its leisure.

I think if we do this though I'd like to have something like:

rpm-ostree upgrade --trigger-automatic-update-policy=timer
rpm-ostree upgrade --trigger-automatic-update-policy=gnome-software

The daemon then tracks (somewhere) the name passed. The idea is that the status output would then look like:

# rpm-ostree status
State: idle; auto updates enabled (stage, agent=gnome-software)

That way administrators understand what's going on. And we should probably explicitly throw an error if the built-in timer is enabled and anything else executes the auto-update policy.

This "tracking the last agent" though only works after things have run at least once. But I think that's OK.
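
A minimal Python sketch of the tracking idea, assuming a hypothetical `AutoUpdateTracker` class (nothing like it exists in rpm-ostree under that name). It records the last agent name passed in, rejects other agents while the built-in timer is enabled, and renders the status line in the format proposed above.

```python
from typing import Optional


class AutoUpdateTracker:
    """Hypothetical daemon-side record of who last ran the auto-update policy."""

    def __init__(self, policy: str, builtin_timer_enabled: bool) -> None:
        self.policy = policy
        self.builtin_timer_enabled = builtin_timer_enabled
        self.last_agent: Optional[str] = None  # unknown until the first run

    def trigger(self, agent: str) -> None:
        # Refuse anything other than the timer while the timer is enabled,
        # so two drivers never compete for the same policy.
        if self.builtin_timer_enabled and agent != "timer":
            raise RuntimeError(
                f"built-in timer is enabled; refusing trigger from {agent!r}"
            )
        self.last_agent = agent

    def status_line(self) -> str:
        details = self.policy
        if self.last_agent is not None:
            details += f", agent={self.last_agent}"
        return f"State: idle; auto updates enabled ({details})"


tracker = AutoUpdateTracker(policy="stage", builtin_timer_enabled=False)
tracker.trigger("gnome-software")
print(tracker.status_line())
```

Before the first trigger the agent is simply omitted from the status line, which matches the caveat that tracking only works after things have run at least once.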

@cgwalters
Member Author

🤔 Though...today we probably could use sd_pid_get_unit() (or sd_pid_get_user_unit()) to work this out automatically.
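
For intuition on what that lookup yields, here is a simplified Python approximation that derives a unit name from one line of `/proc/<pid>/cgroup`. This is not how libsystemd implements `sd_pid_get_unit()`; the function name `unit_from_cgroup_line` and the sample lines are assumptions for illustration only.

```python
from typing import Optional

# Unit types a process can live in directly.
UNIT_SUFFIXES = (".service", ".scope")


def unit_from_cgroup_line(line: str) -> Optional[str]:
    """Guess the system unit from a 'hierarchy:controllers:path' line."""
    parts = line.strip().split(":", 2)
    if len(parts) != 3:
        return None  # not a cgroup line at all
    for component in parts[2].split("/"):
        # Skip empty components and the enclosing slices.
        if not component or component.endswith(".slice"):
            continue
        # The first non-slice component is the unit, if it looks like one.
        return component if component.endswith(UNIT_SUFFIXES) else None
    return None


# A daemon started by systemd sits directly in its service unit...
print(unit_from_cgroup_line("0::/system.slice/cockpit.service"))
# ...while anything run from a real login shows up as a session scope.
print(unit_from_cgroup_line("0::/user.slice/user-0.slice/session-6.scope"))
```

The second case illustrates a limit of this approach: a caller that runs inside a login session is identified by its session scope rather than by a service name.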

cgwalters added a commit to cgwalters/rpm-ostree that referenced this issue May 15, 2018
The high level goal is to render in a better way what caused an
update: coreos#247 (comment)

This gets us for Cockpit:
`Initiated txn DownloadUpdateRpmDiff for client(dbus:1.28 unit:session-6.scope uid:0): /org/projectatomic/rpmostree1/fedora_atomic`
which isn't as good as I'd hoped; I was thinking we'd get `cockpit.service`
but actually Cockpit does invocations as a real login for good reason.

We get a similar result from the CLI.
rh-atomic-bot pushed a commit that referenced this issue May 16, 2018
The high level goal is to render in a better way what caused an
update: #247 (comment)

This gets us for Cockpit:
`Initiated txn DownloadUpdateRpmDiff for client(dbus:1.28 unit:session-6.scope uid:0): /org/projectatomic/rpmostree1/fedora_atomic`
which isn't as good as I'd hoped; I was thinking we'd get `cockpit.service`
but actually Cockpit does invocations as a real login for good reason.

We get a similar result from the CLI.

Closes: #1368
Approved by: jlebon
@jlebon
Member

jlebon commented Nov 26, 2020

In FCOS and RHCOS now, automatic updates are driven by higher-level software like Zincati and the MCO. It's likely there will be more work in the future to make it easier to integrate with automatic update drivers like these. But it's unlikely that we will switch to a model where rpm-ostree takes full responsibility for the automatic update mechanism, because that is highly context dependent.
