-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
install: Add ensure-completion
verb, wire up ostree-deploy → bootc
#915
Conversation
This seems to be working well in my hand-rolled testing. Still TODO:
|
f116a60
to
0425c61
Compare
OK I've rebased this on top of #860 and we successfully pull LBIs at Anaconda install time too now. |
There were a surprising number of things I hit. One of them for example is that anaconda's hand-rolled chroot/container doesn't mount |
21467e0
to
4b111dd
Compare
ensure-completion
verbensure-completion
verb, wire up ostree-deploy → bootc
The plus side of this PR as is is that it has near-zero risk unless explicitly turned on in the two places right now. The anaconda I've given this a fair bit of manual testing, but I think what will help here is to get this into e.g. Fedora rawhide and that'll get things running through the daily integration testing. |
https://github.com/cgwalters/playground/blob/main/202411/inst.sh and inst.ks are how I was testing this |
Broken link |
Hmm, does it work for anyone else? |
I think the repo is just private |
Duh sorry, here's the raw scripts. But yes ideally these bits are generalized into virt-manager/virt-manager#739
|
4b111dd
to
20e68b2
Compare
Rebased 🏄 |
@omertuc mind reviewing? |
OK no traction, @vrothberg can you do it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry that I didn't say it explicitly earlier: I don't have enough knowledge to review this. :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial superficial review, will post another for completion.rs
20e68b2
to
ebe736e
Compare
Pull Request is not mergeable
50c8e1a
to
34e630c
Compare
34e630c
to
6aa5c19
Compare
Ah, CI fell over due to missing |
Anyways, this one is updated. |
So we now have:
and
The latter is for running from anaconda, and takes no arguments Are we happy with these names? Can you update the commit messages to mention both options / the difference between them? IIUC bootc-install-completion is always just invoked by bootc itself, it's not meant for users (I guess this is why it's under |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review still incomplete but posting anyway to get things moving faster
The internals one is totally internal and not something a human should ever invoke, that's why I didn't add it even to the commit message. |
When bootc was created, it started to become a superset of ostree; in particular things like `/usr/lib/bootc/kargs.d` and logically bound images. However...Anaconda today is still invoking `ostree container image deploy`. Main fix -------- When bootc takes over the `/usr/libexec/ostree/ext/ostree-container` entrypoint, make the existing `ostree container image deploy` CLI actually just call back into bootc to fix things up. No additional work required other than getting an updated bootc in the Anaconda ISO. Old Anaconda ISOs ----------------- But, a further problem here is that Anaconda is only updated once per OS major+minor - e.g. there won't be an update to it for the lifetime of RHEL 9.5 or Fedora 41. We want the ability to ship new features and bugfixes in those OSes (especially RHEL9.5). So given that we have a newer bootc in the target container, we can do this: ``` %post --erroronfail bootc install ensure-completion %end ``` And will fix things up. Of course there's fun $details here...the way Anaconda implements `%post` is via a hand-augmented `chroot` i.e. a degenerate container, and we need to escape that and fix some things up (such as a missing cgroupfs mount). Summmary -------- - With a newer bootc in the ISO, everything just works - For older ISOs, one can add the `%post` above as a workaround. Implementation details: Cross-linking bootc and ostree-rs-ext ------------------------------------------------------------- This whole thing is very confusing because now, the linkage between bootc and ostree-rs-ext is bidirectional. In the case of `bootc install to-filesystem`, we end up calling into ostree-rs-ext, and we *must not* recurse back into bootc, because at least for kernel arguments we might end up applying them *twice*. We do this by passing a CLI argument. The second problem is the crate-level dependency; right now they're independent crates so we can't have ostree-rs-ext actually call into bootc directly, as convenient as that would be. So we end up forking ourselves as a subprocess. But that's not too bad because we need to carry a subprocess-based entrypoint *anyways* for the Anaconda `%post` case. Implementation details: /etc/resolv.conf ---------------------------------------- There's some surprising stuff going on in how Anaconda handles `/etc/resolv.conf` in the target root that I got burned by. In Fedora it's trying to query if systemd-resolved is enabled in the target or something? I ended up writing some code to just try to paper over this to ensure we have networking in the `%post` where we need it to fetch LBIs. Signed-off-by: Colin Walters <walters@verbum.org>
Testing my understanding: The reason
uses the image's bootc and not the bootc that ships with the OS that anaconda runs on, is because %post's "degenerate container" is actually a chroot into the filesystem that we just installed the image to (using ostree container image deploy), thus ensuring the new bootc binary is available there? |
// crates. We need an option to skip though so when the *main* | ||
// bootc install code calls this API, we don't do this as it | ||
// will have already been handled. | ||
if !options.skip_completion { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that all future users of ostree container image deploy
(where options.skip_completion
is false
) will have bootc completion running automatically for them?
What about users doing it on non-bootc images? For missing InstallConfiguration (kargs) I see we use Option::into_iter() which will make it a no-op
Similarly for bound images, it will just be an empty list
So it should work in theory, but did you try this in practice? that the command is happy with completely non-bootc images? (Maybe this is covered by CI already somehow?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm basically afraid that this PR will introduce a regression / unexpected errors in users of ostree container image deploy
with non-bootc images, so it would be nice to ensure it doesn't
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't explicitly test that yet, and we should. However note that none of this code will run in practice until the moment when we take over from the version in rpm-ostree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But yes basically the heart of this right now is looking for files (kargs, bound image symlinks) and acting on them if present, and if they don't it's a no-op.
Other than the two open reviews, lgtm fwiw |
Exactly! |
OK then just #915 (comment) needs to be addressed |
6aa5c19
to
c5852ad
Compare
For reference, this does work for LBIs but doesn't for kargs with anaconda; see https://gitlab.com/fedora/bootc/tests/bootc-workflow-test/-/merge_requests/555#note_2284993995 It does work for non Anaconda cases that directly invoke In theory, we could change The LBI support is most important to wedge in to older Anacondas I think, and otherwise we should just proceed with fixing newer Anacondas. |
Thats.... unfortunate. Its the exact use case that broke for me. :/ Though I have a workaround... |
When bootc was created, it started to become a superset of ostree;
in particular things like
/usr/lib/bootc/kargs.d
and logicallybound images.
However...Anaconda today is still invoking
ostree container image deploy
.Main fix
When bootc takes over the
/usr/libexec/ostree/ext/ostree-container
entrypoint, make the existing
ostree container image deploy
CLI actuallyjust call back into bootc to fix things up. No additional work required other
than getting an updated bootc in the Anaconda ISO.
Old Anaconda ISOs
But, a further problem here is that Anaconda is only updated once
per OS major+minor - e.g. there won't be an update to it for the lifetime
of RHEL 9.5 or Fedora 41. We want the ability to ship new
features and bugfixes in those OSes (especially RHEL9.5).
So given that we have a newer bootc in the target container, we can
do this:
And will fix things up. Of course there's fun $details here...the
way Anaconda implements
%post
is via a hand-augmentedchroot
i.e. a degenerate container, and we need to escape that and
fix some things up (such as a missing cgroupfs mount).
Summmary
%post
above as a workaround.Implementation details: Cross-linking bootc and ostree-rs-ext
This whole thing is very confusing because now, the linkage
between bootc and ostree-rs-ext is bidirectional. In the case
of
bootc install to-filesystem
, we end up calling into ostree-rs-ext,and we must not recurse back into bootc, because at least for
kernel arguments we might end up applying them twice. We do
this by passing a CLI argument.
The second problem is the crate-level dependency; right now they're
independent crates so we can't have ostree-rs-ext actually
call into bootc directly, as convenient as that would be. So we
end up forking ourselves as a subprocess. But that's not too bad
because we need to carry a subprocess-based entrypoint anyways
for the Anaconda
%post
case.Implementation details: /etc/resolv.conf
There's some surprising stuff going on in how Anaconda handles
/etc/resolv.conf
in the target root that I got burned by. InFedora it's trying to query if systemd-resolved is enabled in
the target or something?
I ended up writing some code to just try to paper over this
to ensure we have networking in the
%post
where we needit to fetch LBIs.
Signed-off-by: Colin Walters walters@verbum.org