Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quadlet: add -list-images #23065

Closed
wants to merge 1 commit into from

Conversation

vrothberg
Copy link
Member

Add a new -list-images string flag Quadlet which will write a list of images mentioned in .image Quadlets to specified file.

In order to avoid disk bloat, many bootable containers do not embed application-container images. Hence, those must be pulled in some use-case dependent way. Quadlet is perfectly capable of pulling the images on boot or on demand but that does not satisfy all use cases as it slows down boot or initialization of the workloads. In those cases, we want a bootc upgrade to first pull down the bootable container image and then analyze its root FS to extract all referenced images in (rootful) .image Quadlets. This change is a first step in this direction.

Does this PR introduce a user-facing change?

Add a `-list-images` flag to Quadlet to list images referenced in `.image` Quadlets.

@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note labels Jun 21, 2024
Copy link
Contributor

openshift-ci bot commented Jun 21, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vrothberg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 21, 2024
@vrothberg
Copy link
Member Author

@cgwalters, we also need another flag that alters the root / to something customizable. I can add that to this PR (or a follow-up one) but first wanted to make sure we're on the same page.

Add a new `-list-images` string flag Quadlet which will write a list of
images mentioned in `.image` Quadlets to specified file.

In order to avoid disk bloat, many bootable containers do not embed
application-container images.  Hence, those must be pulled in _some_
use-case dependent way.  Quadlet is perfectly capable of pulling the
images on boot or on demand but that does not satisfy all use cases as
it slows down boot or initialization of the workloads.  In those cases,
we want a `bootc upgrade` to first pull down the bootable container
image and then analyze its root FS to extract all referenced images in
(rootful) `.image` Quadlets.  This change is a first step in this
direction.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Copy link
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why write to a file? Normal stdout should be all that is needed I think.

@Luap99
Copy link
Member

Luap99 commented Jun 21, 2024

As for custom root I don't see why this is something quadlet/podman would need to handle, bootc can just spawn the quadlet process in the correct rootfs/mountns I think.

@vrothberg
Copy link
Member Author

Why write to a file? Normal stdout should be all that is needed I think.

Writing to stdout would work as well but I found writing to a file to be more future-proof. In case debug or logs will be added in the future.

As for custom root I don't see why this is something quadlet/podman would need to handle, bootc can just spawn the quadlet process in the correct rootfs/mountns I think.

Yes, that would be an alternative.

@cgwalters
Copy link
Contributor

cgwalters commented Jun 21, 2024

As for custom root I don't see why this is something quadlet/podman would need to handle, bootc can just spawn the quadlet process in the correct rootfs/mountns I think.

There's positives and negatives to that approach. A slight negative is that it generally implies we'll be running a potentially new quadlet binary from the new root. So far we've generally avoided trying to execute code from the new root in general in bootc/ostree. The rationale is doing so increases "skew" possibilities where e.g. things in the new root actually expect the new kernel (think major OS upgrades, etc.)

In practice of course, the rise of containerization has meant that most of userspace ends up needing to have compat with maintained LTS/enterprise kernels; the chance that e.g. a new quadlet binary doesn't work on RHEL8's kernel when we're going from RHEL8 to 9 for example is probably close to nil, as it's basically just reading some files in Go.

Things will get more interesting though if we take the path (as I'd like to do) of also teaching quadlet to help "pin" these images by creating stopped containers for them. That will inherently involve a lot more of the "container runtime" code. But probably what would work is to just get the list from the new root, and then we can run the current root's podman to do pinning from there.

For these reasons, I personally would have made it so quadlet accepts a directory fd or starts from its own working directory; it's cheaper and easier to unit test.

@@ -586,6 +587,31 @@ func process() error {
}
}

if len(listImagesFlag) > 0 {
fd, err := os.OpenFile(listImagesFlag, os.O_WRONLY, 0644)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but best practice to wrap in a bufio.BufWriter right?

@@ -681,4 +707,5 @@ func init() {
flag.BoolVar(&isUserFlag, "user", false, "Run as systemd user")
flag.BoolVar(&dryRunFlag, "dryrun", false, "Run in dryrun mode printing debug information")
flag.BoolVar(&versionFlag, "version", false, "Print version information and exit")
flag.StringVar(&listImagesFlag, "list-images", "", "Write a list of all images being references in .image Quadlets to specified file")
Copy link
Contributor

@cgwalters cgwalters Jun 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we're likely to want to add something like a new flag to opt-in to pinning like
Bootc=bound or Preload=true or Pin=true (however we call it)

I think we'll need to either:

  • Add some sort of --matches=Pin=true flag to this list operation
  • Just return the entire .image file?

I think the second part is important: Consider the case where I want to use a custom pull secret just for some images. In that general case today, we need the AuthFile flag, etc.

This type of thing implies actually that what we will need as a next step is a way to pass the output of this tool (probably a full .image file) into something like podman pull --from-systemd-image or something`?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Podman should not parse quadlet files. That is breaking the well build separation we have. And quite frankly it cannot work as systemd untis may contain systemd specifiers that must be expanded first and this is not logic we should try to mirror because we will fail to match it 1 to 1.
And even if you try to parse that on bootc you will have the same issue, the AuthFile=%h/somefile you have to know what %h is. And then you also have to deal with env var expansions and so on.

As such I fail to see how we can generate the command exactly the way systemd would execute it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifiers are a complex topic indeed. I think %h (homedir) specifically is out of scope for bootc's use case; we should not be interacting with user containers or home directories.

Would people use e.g. %a (architecture) or %l (hostname) or other things like that in these .image files? Maybe...but, honestly I'd also be OK if we just errored out if you used those things for Pin=true .image files today. I think we'd just need to handle "detect unquoted % in input and error out".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So to combine these then, what we could output instead is say JSON, with a list of podman pull ... commands that we would run, and again error out if we find specifiers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although it's a bit messy because if we also want to do pinning via podman create --rm <image>...wait, is there no variant of create that is explicitly ephemeral i.e. its state is tracked in /run? I guess that's not fatal, but it'd be quite nice to have as it means the core logic would just be "on boot, re-pin all images we need to pin" instead of that plus an extra step of "also GC any pin containers that aren't needed anymore".

Copy link
Member Author

@vrothberg vrothberg Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to extract some requirements from the conversation:

  1. We want to pre-pull images after an bootc upgrade
  2. Not all .image files may need to be pre-pulled, so we need some kind of opt-in mechanism
  3. The .image files may use custom pull secrets, so Quadlet should ideally have generated units for the file with a fully expanded podman pull
  4. Respecting the separation of concerns, Quadlet should not leave its purpose of a systemd-generator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to pre-pull images after an bootc upgrade

It's subtler than that I think. We want to "logically bind", and that means that bootc must not queue a new bootloader entry unless the logically bound images have been pulled and are known to be present.

A whole big point of bootc (and ostree before it, and image based systems in general) is having "transactional updates" where the system is known to either be in state A or B.

If we did the naive thing of:

  • bootc upgrade (queuing a new bootloader entry)
  • pull images

Then some tool doing a reboot in between these two phases would create a state like "B, but without app images" which is not desired.

This subtlety is why this is hard to implement without changes to both bootc and podman today.

@Luap99
Copy link
Member

Luap99 commented Jun 21, 2024

For these reasons, I personally would have made it so quadlet accepts a directory fd or starts from its own working directory; it's cheaper and easier to unit test.

I guess given that we just read files it would be simple enough. However if we have to setup mount namespaces and such it gets very tricky with the goroutine != thread default behaviours.

@rhatdan
Copy link
Member

rhatdan commented Jun 21, 2024

Just a thought, but any reason we are not listing the Image field in a .Container quadlet?

@ygalblum
Copy link
Collaborator

You'll probably not agree with me, but I have to say I didn't like this solution when I started looking at this PR and reading @cgwalters's comments, I like it even less.
Quadlet is a systemd generator. The only flags added to it are for debugging. Now, this PR starts piggybacking on it with additional requirements. I like that we try to maintain separation of concerns. Can't this be implemented in a separate executable?

@vrothberg
Copy link
Member Author

vrothberg commented Jun 24, 2024

Quadlet is a systemd generator. The only flags added to it are for debugging. Now, this PR starts piggybacking on it with additional requirements. I like that we try to maintain separation of concerns. Can't this be implemented in a separate executable?

I share your concern about the separation of concerns. I want to change Quadlet as little as possible. Listing .image files and extracting information in Quadlet seems more attractive to me than re-implementing it's parsing logic though; that's concern of Quadlet IMO. Let's see where the conversation goes.

@ygalblum
Copy link
Collaborator

First, a lot of the parsing is implemented in a separate package and what's not can probably be moved somewhere. So, we won't re-implement Quadlet just reuse some of the packages it uses.

An example of my concern is this line from the PR description:

Quadlet is perfectly capable of pulling the images on boot or on demand

This statement is incorrect. Quadlet does not pull images or run containers. What it does is translate the files it supports into systemd service unit files.

@vrothberg
Copy link
Member Author

vrothberg commented Jun 24, 2024

This statement is incorrect. Quadlet does not pull images or run containers. What it does is translate the files it supports into systemd service unit files.

That was just vagueness on my end, sorry. I am fully aware of how Quadlet works, so I meant the services generated by Quadlet can pull images.

@cgwalters
Copy link
Contributor

One thing I'd leave open as a high level possibility here is adding Go to bootc...and if we do that we could choose in the short term to copy the quadlet code. Taking a step farther down, we could later try to factor it out into a shared package.

A lot of tradeoffs there of course...

@ckyrouac
Copy link

ckyrouac commented Jun 24, 2024

Can't this be implemented in a separate executable?

Given the desire to keep the scope of the quadlet binary to only be a systemd generator, this seems like the most sensible approach to me. Something like quadlet-utils or quadlet-cli. The primary downside to this approach is it will be yet another binary to maintain. I don't think that downside is too significant though.

@cgwalters
Copy link
Contributor

How about a quick sync meeting on this to get rough consensus?

@vrothberg
Copy link
Member Author

How about a quick sync meeting on this to get rough consensus?

Created one on Thursday ✔️

@Luap99
Copy link
Member

Luap99 commented Jun 25, 2024

How about a quick sync meeting on this to get rough consensus?

Created one on Thursday ✔️

Please add me as well.


Overall this looks like something bootc specific that nobody else is going to use. I understand that you fine with good enough but I don't think it is good enough. Trying to mimic the systemd unit doesn't seem feasible. Saying systemd specifiers are not supported is not good and trying to reverse engineer systemd specifier expansion is not practical either. As such I really do not think this is practical outside the basic examples.

And to be honest if your goal is to pass the full files you might as well just use the existing quadlet -dryrun option and parse the output from there. That adds zero additional complexity to quadlet and you can just define what bootc supports or not on your side. The decision would not longer be up to quadlet/podman.

@cgwalters
Copy link
Contributor

Overall this looks like something bootc specific that nobody else is going to use.

I think we can define something generic that would work with other image based update systems if that's your concern.

I guess bigger picture I'd been envisioning close cooperation between the podman and bootc projects at an organizational and technical level. But that's just my vision and I'm getting the feeling it's not yours 😄

Backing up then, we can clearly not try to extend .image files at all (and more generally, not try to change podman or align) and introduce in bootc something like /usr/lib/bootc/images which could be a simple toml and not try to be unit file like for example, where bootc owns executing the code required to do the image pulling and pinning.

The user experience then would typically involve writing a foo.container podman unit, and a separate /usr/lib/bootc/images/foo.image file, distinct from the podman .image file. There would be a clear separation, reflecting the current clear separation between the two projects.

@Luap99
Copy link
Member

Luap99 commented Jun 25, 2024

Overall this looks like something bootc specific that nobody else is going to use.

I think we can define something generic that would work with other image based update systems if that's your concern.

I guess bigger picture I'd been envisioning close cooperation between the podman and bootc projects at an organizational and technical level. But that's just my vision and I'm getting the feeling it's not yours 😄

I mean I haven't made my mind up yet regarding how much bootc and podman should or should not be integrated. My vision is to keep things as simple as possible for us so I tend to lean towards not adding special bootc things into podman/quadlet. I am also not a fan of half backed solutions.
That doesn't mean I object out of principle, it would help to get our other maintainers involved as well in some larger meeting where you can lay out your long term vision of how things should work.

I am aware of #22785 but this only really touches on the one feature here.

@cgwalters
Copy link
Contributor

I am also not a fan of half backed solutions.

No one does; let's keep this focused and technical though please and make clear what you think needs improvement so we can reach rough consensus and working code. I thought the specifiers was a valid concern, but you didn't reply to my specific assertion there that the use cases for specifiers for .image files was low to nonexistent.

That said, the more I think about this, I'm leaning towards something like containers/bootc#640 (which is a lower level more general mechanism).

The key question though that would remain even after that is whether my initial intuition/suggestion that extending .image files for this use case makes sense. It feels natural I think if one starts from how the docs look today: https://docs.fedoraproject.org/en-US/bootc/running-containers/

But, I dunno; having /usr/lib/bootc/bound-images/foo.toml would honestly also probably be fine. And yes, there'd be no specifiers there since it wouldn't be a systemd unit-like file at all.

@vrothberg
Copy link
Member Author

But, I dunno; having /usr/lib/bootc/bound-images/foo.toml would honestly also probably be fine. And yes, there'd be no specifiers there since it wouldn't be a systemd unit-like file at all.

I begin to gravitate toward such a file as well. We can discuss details in the meeting tomorrow but it feels like a nice generic solution that can also tackle non-Quadlet use cases.

@Luap99
Copy link
Member

Luap99 commented Jun 26, 2024

No one does; let's keep this focused and technical though please and make clear what you think needs improvement so we can reach rough consensus and working code.

Well the fact the we cannot expand systemd specifiers is amjpr downside to me, it would also be unable to deal with any ExecStartPre/Post lines that, i.e. could be used to generate auth tokens... I agree that a mjaority of users would likely not need that but to me this just doesn't sound like good idea to commit to this just to then get complains from users relying on these things. If I would see a way for us to implement it in the future I wouldn't object but to me it seems impossible to reverse engineer the way system would call the command without actually letting system execute it.

But, I dunno; having /usr/lib/bootc/bound-images/foo.toml would honestly also probably be fine. And yes, there'd be no specifiers there since it wouldn't be a systemd unit-like file at all.

Adding separate file sounds good to me. I guess that could also be useful for the "pinning" thing where we could have a podman config option to read this file and not remove the images listed there.
Of course end users would need to deal with the duplicating in image names between the quadlet file and this special file but I don't think that would be a big issue.

@rhatdan
Copy link
Member

rhatdan commented Jun 26, 2024

Having the separate file would also help in the situations where quadlets only use .container or .kube to pull images.

I am not sure how/If something like MicroShift would preinstall container images as well.

@vrothberg
Copy link
Member Author

Agreed. There are use cases even outside of Quadlet which could benefit from a more generic solution.

@cgwalters
Copy link
Contributor

cgwalters commented Jun 26, 2024

I am not sure how/If something like MicroShift would preinstall container images as well.

This is a tangent, but an interesting one: today Microshift using ostree was explicitly relying on the fact that ostree never itself used whiteouts! I brought this up in ostreedev/ostree#2712

Now that we're using OCI on the wire, this concept of nesting (as you know) comes to the fore.

Also another tangent, I think it doesn't make sense to do RUN podman pull ... even for "physically" bound images, we should support a flow that basically takes the underlying layers, and renames all the files, plus a bit that adds all the metadata as a final layer - this would help ensure that we never re-pull unchanged layers even for "physically" bound images.

IOW it'd look like

[base image layer 1]
[base image layer 2]
...
[embedded content layer 1, but with all files included renamed to prefix with /usr/share/containers/storage/overlay/<blobid>]
...
[embedded layer with everything else in /usr/share/containers/storage *except* the layers]
...

@cgwalters
Copy link
Contributor

Whether microshift would want to switch from "physically bound" to "logically bound" images is an interesting topic, I don't know the answer. I think it'd make the most sense to do logically bound for microshift personally.

@ckyrouac
Copy link

It seems like a separate file would technically work well, I still have a hard time accepting a degraded user experience (even relatively minor). A separate file would mean the user would need to potentially duplicate all of these options to define the image. This can obviously be mitigated with tooling, still these minor inconveniences tend to add up.

In lieu of adding a bootc-bound-image field to the quadlet spec, another option we might consider is storing the quadlet image file in a subdirectory, e.g. /etc/containers/systemd/bootc-bound/my-app.image or something. Any image files in this directory would be considered bound to the bootc image. Then, bootc would call QUADLET_UNIT_DIRS=/etc/containers/systemd/bootc /usr/lib/systemd/system-generators/podman-system-generator to generate the systemd service to pull the image. In the case where the image is only defined on the container quadlet, we would require it to also be defined in an image quadlet in the bootc-bound dir. AFAICT the container spec doesn't support the entire set of options in the image spec, so this would not result in a lot of duplicated config.

You all are way more familiar with these projects than me so I'm sure I'm missing something here though.

@cgwalters
Copy link
Contributor

@baude any opinions on this?

@cgwalters
Copy link
Contributor

We had a realtime sync on this...which was wide ranging and interesting. Hopefully someone else can chime in with a broader summary but I think my takeaway-proposal here is:

Let's add a special /usr/lib/bootc/bound-images (in toml format say) for now, implement it entirely in bootc (leveraging containers/bootc#640 internally) and revisit later larger picture designs for how image "binding" in podman/containers stack might look like in general.

A good start here on the bootc side would just be:

  • Add the file format
  • Implement it only for bootc install as a mechanism to pre-fetch, without also integrating with bootc upgrade yet

That'd force us to work though associated problems like the bootc-image-builder bug I hit, thinking about how pull secrets work at install time too, etc. It'd even be generally useful of course to support pre-fetching at install time without also doing "binding"; something to consider in the bootc config format. For example we might have Mode=prefetch vs Mode=bind.

@vrothberg
Copy link
Member Author

Based on the conversations, I am going to close this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants