This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

OCI manifests blocks auto image update feature #3629

Closed
2 tasks done
mochja opened this issue Aug 2, 2022 · 8 comments
Labels
blocked-needs-validation Issue is waiting to be validated before we can proceed bug

Comments

@mochja

mochja commented Aug 2, 2022

Describe the bug

Hello,

We recently ran into an issue with the combination of OCI images and ACR.

We are starting to move to OCI images, and if we push such an image to our registry, Flux completely blocks the auto-release of new images (even if the newer image is a Docker one).

tl;dr
We found that Flux does not send the application/vnd.oci.image.manifest.v1+json Accept request header, is most likely unable to parse such an image manifest, and gets stuck in an update loop where it tries to inspect OCI images.

Image repositories that do not contain OCI-based images keep auto-releasing without issues, side by side.

Steps to reproduce

  1. Set up Flux with automated deployments from ACR.
  2. Build and push a Docker image to the registry and let Flux release this new image.
  3. Build and push an OCI image and see the error message in the logs.
  4. Build and push another Docker image and confirm that the image is not released.
  5. Remove the OCI image from the registry and watch Flux release the second Docker image.

Expected behavior

Flux continues with automated image updates even if an OCI image is in the registry, either skipping the image release for OCI images or releasing both Docker and OCI images.

Kubernetes version / Distro / Cloud provider

AKS 1.22.11

Flux version

Flux v1.25.2

Git provider

No response

Container Registry provider

ACR

Additional context

Log Output

ts=2022-07-26T07:57:44.541933853Z caller=repocachemanager.go:226 component=warmer canonical_name=XXX.azurecr.io/XXX auth="{map[XXX.azurecr.io:<registry creds for 00000000-0000-0000-0000-000000000000@XXX.azurecr.io, from /docker-config/config.json>]}" err="unknown: some resources specified could not be found" ref=XXX.azurecr.io/XXX:XXX

Failed Request

[Screenshot: failed request]

Headers Sent

[Screenshot: request headers sent]

Missing Header

Accept: application/vnd.oci.image.manifest.v1+json

When the missing header is added to the request, the registry responds with a 200 OK status code.
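
As a rough illustration of the header in question, the sketch below builds a manifest request that advertises both Docker and OCI media types. It is not Flux's code: newManifestRequest, the host, the repository, and the tag are made-up names, and the OCI index type is an assumption added only because the images are multi-platform.

```go
package main

import (
	"fmt"
	"net/http"
)

// Media types advertised in the Accept header. The OCI manifest type is the
// one reported as missing; the OCI index type is an assumption for
// multi-platform images.
var manifestMediaTypes = []string{
	"application/vnd.docker.distribution.manifest.v2+json",
	"application/vnd.docker.distribution.manifest.list.v2+json",
	"application/vnd.oci.image.manifest.v1+json",
	"application/vnd.oci.image.index.v1+json",
}

// newManifestRequest is a hypothetical helper that builds a GET request for
// /v2/<repo>/manifests/<ref> and adds one Accept value per media type.
func newManifestRequest(registryHost, repo, ref string) (*http.Request, error) {
	url := fmt.Sprintf("https://%s/v2/%s/manifests/%s", registryHost, repo, ref)
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	for _, mt := range manifestMediaTypes {
		req.Header.Add("Accept", mt)
	}
	return req, nil
}

func main() {
	// Placeholder host, repository and tag; no request is actually sent.
	req, err := newManifestRequest("example.azurecr.io", "team/app", "PREFIX-42")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	for _, v := range req.Header.Values("Accept") {
		fmt.Println("Accept:", v)
	}
}
```

The request is only constructed here, not sent; authentication against ACR is left out on purpose.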

Registry metadata

I got these using skopeo inspect --raw.

Image built with docker

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 6759,
    "digest": "sha256:33e1c74ca607f8d4fad70beb26c1d818a3089324ed6005758e572b3c1ecc88fb"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 26678859,
      "digest": "sha256:fea43f71b0b0e24e9f10fc13902b90f6640005a677f28cd0d0c8f3cc617c2537"
    },
    {

Image built with podman

{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:8d8f245fcec7e540b3ca52ab6c094a195cf14ece57df35fa285f3c37642afbd9",
    "size": 4820
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:c814d0dc6a670ffb72cb7f2bed388b9a08d6db764c9173a936a5bbd6862b5d1a",
      "size": 2902097
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:b690f3b5dd7ae85a12794f7144bf31e85b03d89889dfcd432a187e3b2316e52a",
      "size": 1894091
    },

Maintenance Acknowledgement

  • I am aware of Flux v1's maintenance status

Code of Conduct

  • I agree to follow this project's Code of Conduct
@mochja mochja added blocked-needs-validation Issue is waiting to be validated before we can proceed bug labels Aug 2, 2022
@kingdonb
Member

kingdonb commented Aug 2, 2022

Please be aware that Flux v1 is in maintenance.

No new features can be added, as all feature development takes place on the Flux v2 project now.

Are you pushing OCI images to the same repo as Docker images? (Do the OCI images contain manifests vs. Docker images which contain the application?) And how are they versioned together?

I'm hoping to understand the use case a bit better from someone who is using it in the wild. Regardless of the maintenance status of Flux v1, I'd like to help you understand ways this issue can be worked around, if possible. But in general, I need to recommend upgrading to Flux v2 to every person who comes here (and a separate priority will be to establish whether or not this issue can affect Flux v2 users).

@mochja
Author

mochja commented Aug 2, 2022

Thank you, @kingdonb! Yes, we are well aware of the maintenance mode. This issue impacted us heavily, as we had to remove all OCI-based images from the registry to get Flux working.

Are you pushing OCI images to the same repo as Docker images?

Yes.

Do the OCI images contain manifests vs. Docker images which contain the application?

They are 1:1, just a different standard: both (Docker/OCI) images are multi-platform and contain the application.

And how are they versioned together?

They use a separate integer sequence for tags for other reasons, but in general we try to treat them the same, since other tools (docker, k8s) support both natively. We use glob-based versioning (PREFIX-*) for auto image updates in Flux.

We are moving away from Flux v1, but there is still a long way to go.
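
To show what the PREFIX-* glob selects, here is a small sketch that filters a tag list with Go's standard path.Match. This is a stand-in for illustration, not necessarily the matcher Flux uses, and filterTags and the sample tags are hypothetical.

```go
package main

import (
	"fmt"
	"path"
)

// filterTags returns the tags that match a glob pattern such as "PREFIX-*".
// path.Match is used as a stand-in glob matcher; tags contain no "/" so "*"
// matches the whole remainder of the tag.
func filterTags(pattern string, tags []string) ([]string, error) {
	var matched []string
	for _, tag := range tags {
		ok, err := path.Match(pattern, tag)
		if err != nil {
			return nil, err // the pattern itself is malformed
		}
		if ok {
			matched = append(matched, tag)
		}
	}
	return matched, nil
}

func main() {
	// Sample tags are made up; Docker and OCI builds share the same prefix.
	tags := []string{"PREFIX-101", "PREFIX-102", "other-7", "latest"}
	matched, err := filterTags("PREFIX-*", tags)
	if err != nil {
		panic(err)
	}
	fmt.Println(matched)
}
```

Because the filter works on tag names only, an OCI image and a Docker image under the same prefix both end up in the candidate set.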

@kingdonb
Member

kingdonb commented Aug 2, 2022

We don't want Flux v1 to block OCI adoption. I will think about this. It sounds like a pretty trivial change is all that is needed to unblock this behavior, and it would change nothing substantively about the behavior of Flux.

Those are two very bold statements for someone who has done as little research about this topic as I have for now, but I don't think there's much that can go wrong simply adding an Accept: header if that's all the change that is needed.

Maybe a flag could enable the accept header to be extra cautious. We are very careful about making changes to Flux v1, as many people likely auto-upgrade to keep up with releases for security, so the guarantee for no breaking changes covers basically every aspect of Flux v1 while it is in maintenance mode.

Thank you for raising this to our attention, and if a PR can be merged that doesn't break anything, I will have to consider it.

@mochja
Author

mochja commented Aug 2, 2022

@kingdonb FYI, I don't know if adding the request header will be enough. I haven't found such headers in the codebase; the only thing I found was the manifest parsing here:
https://github.com/fluxcd/flux/blob/master/pkg/registry/client.go#L116
I guess an additional case for the OCI image manifest might be required, and adding the Accept headers using https://github.com/distribution/distribution/blob/b5ca020cfbe998e5af3457fda087444cf5116496/registry.go#L83
may work, if we want Flux to support such manifests.

On another note, is the actual blocking of auto image updates in the case of a "missing part of an image" by design? I could imagine that if we cannot determine the complete state of an image repository, we would want to disable the auto image update.
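
To make the "additional case" idea concrete, here is a hypothetical sketch (not the code in client.go) that tells the two manifest formats apart. It falls back to config.mediaType because the podman-built manifest in the skopeo output has no top-level mediaType. manifestFormat and manifestHeader are invented names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const (
	dockerManifestV2 = "application/vnd.docker.distribution.manifest.v2+json"
	dockerConfigV1   = "application/vnd.docker.container.image.v1+json"
	ociManifestV1    = "application/vnd.oci.image.manifest.v1+json"
	ociConfigV1      = "application/vnd.oci.image.config.v1+json"
)

// manifestHeader captures only the fields needed to tell the formats apart.
type manifestHeader struct {
	MediaType string `json:"mediaType"`
	Config    struct {
		MediaType string `json:"mediaType"`
	} `json:"config"`
}

// manifestFormat reports "docker" or "oci" for a raw image manifest and
// returns an error for anything else, instead of failing silently.
func manifestFormat(raw []byte) (string, error) {
	var h manifestHeader
	if err := json.Unmarshal(raw, &h); err != nil {
		return "", err
	}
	switch {
	case h.MediaType == dockerManifestV2 || h.Config.MediaType == dockerConfigV1:
		return "docker", nil
	case h.MediaType == ociManifestV1 || h.Config.MediaType == ociConfigV1:
		return "oci", nil
	default:
		return "", fmt.Errorf("unsupported manifest: mediaType=%q config=%q",
			h.MediaType, h.Config.MediaType)
	}
}

func main() {
	// Trimmed-down manifests that keep only the media type fields.
	docker := []byte(`{"schemaVersion":2,"mediaType":"application/vnd.docker.distribution.manifest.v2+json","config":{"mediaType":"application/vnd.docker.container.image.v1+json"}}`)
	oci := []byte(`{"schemaVersion":2,"config":{"mediaType":"application/vnd.oci.image.config.v1+json"}}`)
	for _, raw := range [][]byte{docker, oci} {
		format, err := manifestFormat(raw)
		fmt.Println(format, err)
	}
}
```

A real client would more likely dispatch on the Content-Type the registry returns; the JSON inspection here is just the simplest way to show the extra branch.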

@kingdonb
Member

kingdonb commented Aug 3, 2022

That is entirely possible, and that is a part of the image update design in Flux v1. It must take every image in the repository (assuming there haven't been any image filters enabled that would exclude some of them) and download the metadata of those images, then put them precisely in order, and if the metadata for any image is missing, then the automation will fail (because it cannot be determined if one of the images with missing metadata is the "newest" image that should be used.)
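
A minimal sketch of that all-or-nothing behavior, assuming images are ordered by a creation timestamp (imageInfo, newest and mustTime are hypothetical names, not Flux types):

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"time"
)

// imageInfo is a stand-in for the per-tag metadata that gets downloaded.
type imageInfo struct {
	Tag       string
	CreatedAt time.Time // zero value means the metadata could not be fetched
}

// newest returns the most recently created image. It refuses to pick one if
// any image lacks metadata, because that image might be the newest.
func newest(images []imageInfo) (imageInfo, error) {
	if len(images) == 0 {
		return imageInfo{}, errors.New("no images to order")
	}
	for _, img := range images {
		if img.CreatedAt.IsZero() {
			return imageInfo{}, fmt.Errorf("metadata missing for tag %q; cannot order images", img.Tag)
		}
	}
	sorted := append([]imageInfo(nil), images...) // keep the caller's slice intact
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].CreatedAt.After(sorted[j].CreatedAt)
	})
	return sorted[0], nil
}

// mustTime parses an RFC 3339 timestamp and panics on bad input.
func mustTime(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	images := []imageInfo{
		{Tag: "PREFIX-1", CreatedAt: mustTime("2022-07-01T00:00:00Z")},
		{Tag: "PREFIX-2", CreatedAt: mustTime("2022-07-20T00:00:00Z")},
	}
	fmt.Println(newest(images))

	// One tag whose manifest could not be read blocks the whole selection.
	images = append(images, imageInfo{Tag: "PREFIX-3"})
	fmt.Println(newest(images))
}
```

In this model a single unreadable manifest is enough to stop every update in the repository, which matches the symptom reported above.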

I don't know precisely why adding OCI images to the same repo would block this, but I do know that we have many users still on Flux v1 and nobody has asked for this before.

Sorry for my ignorance but if you will entertain the question, what precisely changes when you start to publish OCI images to your repository? (Is it a flag added to docker push or docker build, or is it a totally different image producer, or ...)

I have only considered using OCI as a storage format, aka ORAS, and only as it's been adopted by Flux (and Helm) – for shipping manifests. I had not considered that there might be a reason to use OCI images instead of "regular Docker images" to ship your application / runtime software image. So I don't quite understand the breadth of possibilities added when your application is distributed "as OCI images" in a way that would require this Accept header, (and what else I do not know.)

@mochja
Author

mochja commented Aug 3, 2022

@kingdonb we are adopting podman, which builds OCI images by default. We switched to --format=docker because of this issue, and frankly, after removing all OCI artefacts, Flux works again.

@kingdonb
Member

kingdonb commented Aug 3, 2022

Got it, so it was a side-effect that you use OCI images (not necessarily a design requirement) and there is a workaround, at least in the scope of your specific manifestation of the issue. This will be helpful for anyone else that stumbles onto this issue while we work out whether any change is needed or possible here 👍

@kingdonb
Member

I'm going to close this issue, as we will be archiving the fluxcd/flux repository soon:
