Support character devices (whiteouts) in ostree commits #2712

Closed
mangelajo opened this issue Sep 12, 2022 · 19 comments

@mangelajo
Contributor

mangelajo commented Sep 12, 2022

This is an enhancement request to have ostree support the commit storage
of whiteout 0:0 character devices.

Motivation

Embedding container storage into ostree images allows fast, offline container
startup (no need to contact a registry) on ostree-based systems, where the containers
can be stored in the native container storage format within the ostree image.

overlayfs in the Linux kernel implements the deletion of a file from one layer to the
next by creating a char 0:0 file (a whiteout) in the new layer.

But today, ostree does not support the storage of character devices:

rm -rf rootfs test
ostree --repo=./test init --mode=bare-user
mkdir rootfs
mknod -m 0 rootfs/whiteout c 0 0
ostree commit --repo=./test ./rootfs --branch test/stable/x86_64

results:

error: Not a regular file or symlink: whiteout

This limits our ability to encode container storage into ostree images.
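To see which entries in a tree would trip the error above, a small scan can help. The following is a minimal sketch, not part of ostree: the helper names `is_overlayfs_whiteout` and `find_uncommittable` are hypothetical, and it assumes that only directories, regular files, and symlinks are accepted by the commit.

```python
import os
import stat


def is_overlayfs_whiteout(mode: int, rdev: int) -> bool:
    """True for a character device with device number 0:0 (an overlayfs whiteout)."""
    return stat.S_ISCHR(mode) and os.major(rdev) == 0 and os.minor(rdev) == 0


def find_uncommittable(root: str) -> list[str]:
    """List entries under root that are not directories, regular files, or symlinks.

    Assumption: these are the only file types the commit accepts today.
    """
    bad = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            mode = st.st_mode
            if stat.S_ISDIR(mode) or stat.S_ISREG(mode) or stat.S_ISLNK(mode):
                continue
            if is_overlayfs_whiteout(mode, st.st_rdev):
                kind = "whiteout"
            else:
                kind = "other special file"
            bad.append(f"{os.path.relpath(path, root)} ({kind})")
    return sorted(bad)
```

Running `find_uncommittable("rootfs")` on the reproducer tree should report the `whiteout` entry, assuming the `mknod` call succeeded.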

References

This necessity has been brought up in the past, and from other projects that build on os

Alternatives

  1. Compacting images into a single layer

This would remove the introduction of whiteout files and would not increase the storage size
since ostree is a content-addressed storage system, but would modify the sha256 hash
of images that sometimes are used as references.

i.e. quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ba44dead03ea74107f90d58525106fb52d27a120b73c6cc8e2be31d37043ca1c

  2. Avoiding deletion between layers

This is not technically feasible in some cases (the next layer or the final container expects a file
not to exist) and is hard to manage over time. It also precludes including existing off-the-shelf
images where the builder of the ostree has no control over the image build.

Possible issues

This will probably not be usable when a new ostree layer is being built from within a container, since you can't
create a whiteout inside an overlayfs mount (assumption here).

@cgwalters
Member

> This would remove the introduction of whiteout files but would increase the storage size
> exponentially (depending on the number of shared layers) for systems where many
> images share common layers.

No it wouldn't, because ostree is a content-addressed storage system.

@mangelajo
Contributor Author

> > This would remove the introduction of whiteout files but would increase the storage size
> > exponentially (depending on the number of shared layers) for systems where many
> > images share common layers.
>
> No it wouldn't, because ostree is a content-addressed storage system.

ack, removing it, but there is another con to compacting the images. Will the sha256 hashes sometimes used to reference the images change?

@mangelajo
Contributor Author

@cgwalters by content-addressed you mean it deduplicates files when the content is the same for two files?

@cgwalters
Member

> @cgwalters by content-addressed you mean it deduplicates files when the content is the same for two files?

Yes.
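As a rough illustration of why flattening does not multiply storage, here is a toy content-addressed store in which two images sharing a base are stored without duplicating the shared files. This is a simplified sketch: the `ContentStore` class is hypothetical, and real ostree checksums also cover file metadata, which this model ignores.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: each object is keyed by the SHA-256 of its bytes."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self.objects.setdefault(digest, data)
        return digest

    def commit(self, tree: dict[str, bytes]) -> dict[str, str]:
        """Store every file of a tree and return a path -> digest manifest."""
        return {path: self.put(data) for path, data in tree.items()}


store = ContentStore()
base = {"usr/bin/sh": b"shell", "etc/os-release": b"ID=example"}
image_a = {**base, "usr/bin/app-a": b"app a"}  # flattened image A
image_b = {**base, "usr/bin/app-b": b"app b"}  # flattened image B
store.commit(image_a)
store.commit(image_b)
print(len(store.objects))  # 4 objects stored for 6 file entries
```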

@mangelajo mangelajo changed the title Support character devices (whiteouts) in ostree commit files Support character devices (whiteouts) in ostree commis Sep 12, 2022
@mangelajo mangelajo changed the title Support character devices (whiteouts) in ostree commis Support character devices (whiteouts) in ostree commits Sep 12, 2022
@mangelajo
Contributor Author

This would in theory be a partial revert of 62a8963 and 125889f

@mangelajo
Contributor Author

> This would in theory be a partial revert of 62a8963 and 125889f

I'm preparing a PR for this so we can check whether it works or not. I will also try to test how the container registry transport works with this.

Not straightforward, as the commits are almost 10 years old :) ostree has come a very long way

@mangelajo
Contributor Author

> > This would in theory be a partial revert of 62a8963 and 125889f
>
> I'm preparing a PR for this so we can check whether it works or not. I will also try to test how the container registry transport works with this.
>
> Not straightforward, as the commits are almost 10 years old :) ostree has come a very long way

My most naive comment for 2022 :) ostree has really come a very long way since 2013.

@mangelajo
Contributor Author

Ok, after looking at this, I would like to propose we reduce the scope of this proposal to supporting only the char 0:0 devices used by overlayfs to signal whiteouts at a low level.

This would eliminate the need to introduce additional fields in the encoding of bare-user metadata and to test and maintain backwards compatibility, and it would minimize the number of changes that need to be made.

An older ostree would be unable to checkout a tree with char 0:0 devices but I believe that would be all.

@cgwalters do you believe that would make sense?

@cgwalters
Member

> This would eliminate the need for introducing additional fields on the encoding of bare-user metadata,

It's more that there is no space for new fields in file metadata stored by ostree today; it was somewhat intentionally non-extensible, because extensibility gets into tension with content-based addressing (needs to have a canonical form, etc.)

So indeed such an encoding would need to be defined in the existing space. Now, we could say that if the file mode says it's a character device, assume it's a whiteout with 0:0 st_rdev.

And loosen the commit process to accept only those devices.

There's a bunch of code that will need touching for this, not just in ostree core but also in e.g. https://github.com/ostreedev/ostree-rs-ext/tree/main/lib/src/tar
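The rule suggested in this comment can be sketched as a pair of checks: accept a character device at commit time only when it is 0:0, and on checkout treat any stored character-device mode as implying 0:0, so no extra metadata field is needed. This is only an illustration under those assumptions; `check_committable` and `implied_rdev` are hypothetical names, not ostree functions.

```python
import os
import stat

WHITEOUT_RDEV = os.makedev(0, 0)  # device number used by overlayfs whiteouts


def check_committable(mode: int, rdev: int) -> None:
    """Raise ValueError unless the entry fits the loosened commit rule."""
    if stat.S_ISREG(mode) or stat.S_ISLNK(mode) or stat.S_ISDIR(mode):
        return
    if stat.S_ISCHR(mode) and rdev == WHITEOUT_RDEV:
        # Whiteout: the device number is implied by the file type,
        # so only the mode needs to be recorded.
        return
    raise ValueError("Not a regular file, symlink, or whiteout")


def implied_rdev(mode: int) -> int:
    """On checkout, a stored character-device mode can only mean device 0:0."""
    if not stat.S_ISCHR(mode):
        raise ValueError("not a character device")
    return WHITEOUT_RDEV
```

Because the device number is never stored, the serialized metadata keeps its existing canonical form, which is the property the comment is concerned with.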

@cgwalters
Member

> but would modify the sha256 references of images that sometimes are used as references.
> i.e. quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ba44dead03ea74107f90d58525106fb52d27a120b73c6cc8e2be31d37043ca1c

Note though that in this use case, podman on the client system is not fetching the image. It is already fetched, and it is in read only storage. Hence, there's no need for it to validate the digest of anything.

Taking this down further of course opens the question of whether you really want to use podman at all versus directly invoking crun or systemd-nspawn.

One approach perhaps is for podman to grow support for a read-only data storage which contains a mapping of digested pull specs to already unpacked and ready directory trees.

mangelajo added a commit to mangelajo/ostree that referenced this issue Sep 14, 2022
@dhellmann

> > but would modify the sha256 references of images that sometimes are used as references.
> > i.e. quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ba44dead03ea74107f90d58525106fb52d27a120b73c6cc8e2be31d37043ca1c
>
> Note though that in this use case, podman on the client system is not fetching the image. It is already fetched, and it is in read only storage. Hence, there's no need for it to validate the digest of anything.
>
> Taking this down further of course opens the question of whether you really want to use podman at all versus directly invoking crun or systemd-nspawn.
>
> One approach perhaps is for podman to grow support for a read-only data storage which contains a mapping of digested pull specs to already unpacked and ready directory trees.

This is a problem for MicroShift, where we're using kubelet, crio, etc. No podman.

@cgwalters
Member

cgwalters commented Sep 14, 2022

> This is a problem for MicroShift, where we're using kubelet, crio, etc. No podman.

Ultimately anything in this space would live in e.g. containers/storage which is shared between the two.

@cgwalters
Member

cgwalters commented Sep 14, 2022

To be clear: I am not personally opposed to supporting whiteouts. But it's not going to be a trivial fix, and I'd be a bit concerned about trying to ship it to e.g. RHEL8 today.

For this reason, I think the "flattening" path is likely the better bet in a short term scenario. It's also more efficient - in the "whiteouts on client" model, anything removed via whiteouts still gets shipped over the wire in the ostree commit. In the "flatten at build time" model, the client doesn't pointlessly fetch deleted files. A particularly relevant thing here is e.g. rpm-software-management/rpm#2005

The tradeoff is definitely that some modification/tweaking of the container runtime is potentially needed.
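The efficiency argument for flattening at build time can be illustrated with a toy model of layers. This is a sketch under simplifying assumptions: a layer is a plain mapping of path to bytes, `None` stands in for a char 0:0 whiteout, and opaque directories and file metadata are ignored. The `flatten` and `payload_size` helpers are hypothetical.

```python
from typing import Dict, List, Optional

# A layer maps path -> file bytes; None stands in for a whiteout (assumption).
Layer = Dict[str, Optional[bytes]]


def flatten(layers: List[Layer]) -> Dict[str, bytes]:
    """Apply layers bottom to top, resolving whiteouts at build time."""
    merged: Dict[str, bytes] = {}
    for layer in layers:
        for path, data in layer.items():
            if data is None:
                # A whiteout hides the path and everything beneath it.
                hidden = [p for p in merged if p == path or p.startswith(path + "/")]
                for p in hidden:
                    del merged[p]
            else:
                merged[path] = data
    return merged


def payload_size(files: Dict[str, bytes]) -> int:
    """Total number of content bytes that would have to be shipped."""
    return sum(len(data) for data in files.values())


lower: Layer = {"usr/bin/app": b"app", "usr/share/doc/manual.txt": b"x" * 1000}
upper: Layer = {"usr/share/doc": None}  # the upper layer deletes the docs

flat = flatten([lower, upper])
print(sorted(flat))  # ['usr/bin/app']
print(payload_size(lower), payload_size(flat))  # 1003 3
```

In the layered model the client still receives the 1000-byte file that the upper layer deletes; in the flattened model it never leaves the build host.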

@mangelajo
Contributor Author

mangelajo commented Sep 14, 2022

Yes, in github.com/openshift/microshift we are inheriting the image problems related to OCP/rpm's, and I agree that not shipping deleted files would be an improvement.

MicroShift uses unmodified OCP images extracted from every ocp release image listing as we rebase MicroShift over time.

The problems we would have with flattening:

  1. The sha hash would change, I guess there is no solution for that.
  2. If we created flat public versions of the images, it would work well for MicroShift as ostree embedded container storage, but would not work non-embedded downloaded over the net ...

We also considered storing archived containers in ostree and syncing them to the crio storage before MicroShift boots, but that's a time-consuming, IO-intensive operation to put on Edge devices, and it would duplicate the storage (I need to check whether the hash integrity of the image is preserved in that case).

Let's keep thinking and see if we can find workable alternatives.

I'm continuing with the PR in the meantime; I want to clean it up and add tests. I'm sure there's plenty to improve, or that I simply didn't get right, as I probably cannot get familiar enough with the ostree codebase in the time I have available. It would be completely fine if in the end that PR needs to be thrown away for a different implementation; at least I hope it can help figure out whether there is any technical reason preventing us from doing this.

mangelajo added a commit to mangelajo/ostree that referenced this issue Sep 16, 2022
@cgwalters
Member

A new idea that came out of discussion here:

  • During the build process (or as part of ostree commit) add support for changing whiteouts into e.g. .ostree-wh- regular files; this will allow shipping them as is using existing ostree tooling (and will work with ostree container encapsulate to a container image)
  • Add ostree checkout --process-ostree-whiteouts (or just do this by default for system deployments) to convert them back into whiteout char devices
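The two bullets above amount to a reversible rename between a whiteout's path and a regular placeholder file. A minimal sketch of that mapping follows; the function names are hypothetical, and the prefix is an assumption, since this thread spells it `.ostree-wh-`, `.wh-ostree.` and `.ostree-wh.` in different places.

```python
import posixpath
from typing import Optional

# Assumed placeholder prefix; the thread uses several spellings.
PLACEHOLDER_PREFIX = ".ostree-wh."


def to_placeholder(path: str) -> str:
    """Commit side: map a whiteout's path to the regular file that stands in for it."""
    parent, name = posixpath.split(path)
    return posixpath.join(parent, PLACEHOLDER_PREFIX + name)


def from_placeholder(path: str) -> Optional[str]:
    """Checkout side: return the whiteout path for a placeholder, or None otherwise."""
    parent, name = posixpath.split(path)
    if not name.startswith(PLACEHOLDER_PREFIX) or name == PLACEHOLDER_PREFIX:
        return None
    return posixpath.join(parent, name[len(PLACEHOLDER_PREFIX):])
```

On checkout, each path returned by `from_placeholder` would then be created as a char 0:0 device, for instance with `os.mknod(path, stat.S_IFCHR, os.makedev(0, 0))`, which usually requires elevated privileges.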

mangelajo added a commit to mangelajo/ostree that referenced this issue Sep 19, 2022
Introduces an intermediate format for overlayfs storage, where
.wh-ostree. prefixed files will be converted into char 0:0
whiteout devices used by overlayfs to mark deletions across layers.

Related-Issue: ostreedev#2712
@mangelajo
Contributor Author

> A new idea that came out of discussion here:
>
>   • During the build process (or as part of ostree commit) add support for changing whiteouts into e.g. .ostree-wh- regular files; this will allow shipping them as is using existing ostree tooling (and will work with ostree container encapsulate to a container image)
>   • Add ostree checkout --process-ostree-whiteouts (or just do this by default for system deployments) to convert them back into whiteout char devices

#2717

This is how it would look; it still lacks overwrite handling, SELinux policy handling, and testing.

mangelajo added a commit to mangelajo/ostree that referenced this issue Sep 27, 2022
Introduces an intermediate format for overlayfs storage, where
.wh-ostree. prefixed files will be converted into char 0:0
whiteout devices used by overlayfs to mark deletions across layers.

The CI scripts now use a volume for the scratch directories
previously in /var/tmp; otherwise we cannot create whiteout
devices inside an overlayfs-mounted filesystem.

Related-Issue: ostreedev#2712
mangelajo added a commit to mangelajo/ostree that referenced this issue Sep 28, 2022
mangelajo added a commit to mangelajo/ostree that referenced this issue Sep 28, 2022
Introduces an intermediate format for overlayfs storage, where
.wh-ostree. prefixed files will be converted into char 0:0
whiteout devices used by overlayfs to mark deletions across layers.

The CI scripts now use a volume for the scratch directories
previously in /var/tmp; otherwise we cannot create whiteout
devices inside an overlayfs-mounted filesystem.

Related-Issue: ostreedev#2712
(cherry picked from commit e234b63)
@cgwalters
Member

This was fixed in #2722

@t-moe

t-moe commented Apr 26, 2023

@cgwalters
If I'm reading this correctly, we now have support for restoring whiteouts at checkout.
But in order to create a commit containing whiteouts, one must use the ostree container encapsulate API.

Could we also add support for changing whiteouts into .ostree-wh. regular files directly to the ostree commit CLI?
We're trying to commit uncompressed docker images (e.g. stored under /usr/lib/docker) to an ostree repo. This currently still fails with error: Not a regular file or symlink...

@cgwalters
Member

> Could we also add support for changing whiteouts into .ostree-wh. regular files directly to the ostree commit CLI?

Yep, it'd make sense, though one can do it today as a pre-pass before committing too.
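Such a pre-pass could look roughly like the sketch below, which walks a tree and swaps each char 0:0 device for an empty regular file carrying the placeholder prefix. It is an illustration only: `replace_whiteouts` is a hypothetical helper, the `.ostree-wh.` prefix is taken from this thread and should be checked against the ostree version in use, and the tree is modified in place, so it is safest to run it on a copy.

```python
import os
import stat
from typing import Callable

# Assumed placeholder prefix, as spelled in the question above.
PLACEHOLDER_PREFIX = ".ostree-wh."


def is_whiteout(path: str) -> bool:
    """True if path is a character device with device number 0:0."""
    st = os.lstat(path)
    return stat.S_ISCHR(st.st_mode) and st.st_rdev == os.makedev(0, 0)


def replace_whiteouts(root: str, detect: Callable[[str], bool] = is_whiteout) -> list[str]:
    """Replace every whiteout under root with an empty placeholder file.

    Returns the placeholder paths relative to root. The detect argument
    exists so the walk can be exercised without creating device nodes.
    """
    converted = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Device nodes are not directories, so os.walk lists them in filenames.
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not detect(path):
                continue
            placeholder = os.path.join(dirpath, PLACEHOLDER_PREFIX + name)
            os.unlink(path)
            # "x" mode fails if the placeholder already exists instead of
            # silently overwriting a real file.
            with open(placeholder, "x"):
                pass
            converted.append(os.path.relpath(placeholder, root))
    return sorted(converted)
```

After the pre-pass, the tree contains only regular files, symlinks, and directories, so the `ostree commit` invocation from the original report should no longer hit the "Not a regular file or symlink" error for whiteouts.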
