Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upconvert remote schema 1 images #377

Closed
jonjohnsonjr opened this issue Feb 20, 2019 · 9 comments · Fixed by #408
Closed

Upconvert remote schema 1 images #377

jonjohnsonjr opened this issue Feb 20, 2019 · 9 comments · Fixed by #408
Labels
enhancement New feature or request

Comments

@jonjohnsonjr
Copy link
Collaborator

It's not too hard to convert a schema 1 image to a schema 2 image. We can do this whenever we encounter a schema 1 image in remote.Image in order to support pulling older images or pulling from registries that don't support schema 2 yet.

non-goal: Pushing schema 1 images. These are legacy and we shouldn't help keep them alive.

See: GoogleContainerTools/kaniko#509

It's expensive to do this conversion because we have to gunzip and sha256 each layer to compute the DiffIDs. In some cases we can do this lazily. In other cases, doing this expensive conversion is better than nothing.

@xingao267
Copy link
Member

xingao267 commented Mar 7, 2019

Hi @jonjohnsonjr, I'm looking at migrating rules_docker to base on this repo instead of the python version of it. Just to confirm the schema 1 you would like to upconvert is https://docs.docker.com/registry/spec/manifest-v2-1/ right, not https://github.com/moby/moby/blob/master/image/spec/v1.md? So basically, this go-containerregistry repo only supports https://docs.docker.com/registry/spec/manifest-v2-2? I got a little confused because of the naming of v1 folder in this repo.

@jonjohnsonjr
Copy link
Collaborator Author

I see, yeah that is confusing. The v1 folder here refers to the OCI version, e.g. https://github.com/opencontainers/image-spec/blob/master/media-types.md

The images we'd upconvert are v2 schema 1; we currently only support v2 schema 2 (AKA OCI images, kind of).

Why do you ask?

@xingao267
Copy link
Member

xingao267 commented Mar 7, 2019

Thanks for the confirmation. Our team maintains rules_docker, and we are thinking of migrating rules_docker from the python containerregistry to this one to get better support and solve the py2/3 compatibility issue altogether. I'm currently evaluating the differences between the two containerregistry repos and see if there are feature parity and estimate the gaps we need to fill, which means we might need to request features and sending PRs in this repo if needed. I'm also putting up an internal document. I'll share that with you once finished. Thanks!

@glyn
Copy link
Contributor

glyn commented Mar 11, 2019

I hit this with some Istio-related images (prom/statsd-exporter and quay.io/coreos/hyperkube).

@jonjohnsonjr Apart from the performance overhead, which is "pay as you go", you mentioned on slack that there are some disadvantages of implementing this issue. Please could you elaborate?

@jonjohnsonjr
Copy link
Collaborator Author

Schema 1 images are generally super ancient and will have tons of unaddressed vulnerabilities. The exception to this is quay.io, which for reasons I cannot understand didn't implement schema 2 support until very recently. Poking around, most images are still schema1 so it seems not to be completely rolled out yet?

It's mostly a philosophical point, but by not supporting schema 1 in this library, we're shining a light on unexpected dependencies. The prom/statsd-exporter:v0.6.0 gets dropped in the latest istio release, rightly.

Istios' hyperkube dependency is weirder, it seems like they just use it as a bash shell that happens to have kubectl installed. That doesn't seem to be the purpose of hyperkube, which AFAICT is basically busybox but for kubernetes API components.

I guess I'm reluctant to enable people to continue to depend on schema 1 images, because people should basically never be depending on schema 1 images, except for quay, which I think is being fixed.

FWIW I started playing with this a while ago to see how annoying it would be:
master...jonjohnsonjr:schema1

I don't want to merge that in because it's not well tested, but it seems to work as a workaround if my protests aren't convincing enough :)

@glyn
Copy link
Contributor

glyn commented Mar 12, 2019

Schema 1 images are generally super ancient and will have tons of unaddressed vulnerabilities. The exception to this is quay.io, which for reasons I cannot understand didn't implement schema 2 support until very recently. Poking around, most images are still schema1 so it seems not to be completely rolled out yet?

It's mostly a philosophical point, but by not supporting schema 1 in this library, we're shining a light on unexpected dependencies. The prom/statsd-exporter:v0.6.0 gets dropped in the latest istio release, rightly.

Istios' hyperkube dependency is weirder, it seems like they just use it as a bash shell that happens to have kubectl installed. That doesn't seem to be the purpose of hyperkube, which AFAICT is basically busybox but for kubernetes API components.

I guess I'm reluctant to enable people to continue to depend on schema 1 images, because people should basically never be depending on schema 1 images, except for quay, which I think is being fixed.

Of the published images for quay.io/coreos/hyperkube, the highest tag (v1.10.5_coreos.0, 2 Jul 2018) and the most recent tag (v1.9.6_coreos.2, 2 Feb 2019) still use schema v1. So it may be a while until that image is available with schema v2.

FWIW I started playing with this a while ago to see how annoying it would be:
master...jonjohnsonjr:schema1

I don't want to merge that in because it's not well tested, but it seems to work as a workaround if my protests aren't convincing enough :)

The conversion seems to work (slowly). However, not surprisingly, the image digest isn't preserved, which is another reason for not merging IMO.

@josephschorr
Copy link

Schema 1 images are generally super ancient and will have tons of unaddressed vulnerabilities. The exception to this is quay.io, which for reasons I cannot understand didn't implement schema 2 support until very recently. Poking around, most images are still schema1 so it seems not to be completely rolled out yet?

(Tech lead and cofounder of Quay here)

Long story aside as to why it has taken us significant time to rollout Schema 2 support (best told in a talk I intend to give at some point soon), even once Schema 2 support is fully rolled out, we are not going to be up-converting Schema 1 manifests into Schema 2 manifests, as it would change the content digests for existing manifests which are attached to tags.

The expectation, as per our read of the Schema 1 -> Schema 2 migration, has always been that clients will remain supporting both versions for a significant period of time after push support for Schema 1 has been deprecated and removed from Docker and other clients (which seems to be sometime this month for Docker).

If we find this to not be the case, we might reassess and expand our conversion code to return a Schema 2 form of Schema 1 manifests when supported, but as this would, as mentioned, change the manifest content digest, we'd highly prefer not to do so, as it will break image signing/verification by users.

@jonjohnsonjr
Copy link
Collaborator Author

Long story aside as to why it has taken us significant time to rollout Schema 2 support (best told in a talk I intend to give at some point soon)

Would love to watch that if it's public :)

we are not going to be up-converting Schema 1 manifests into Schema 2 manifests, as it would change the content digests for existing manifests which are attached to tags.

Yeah I definitely wouldn't expect upconversion -- not only would it break everything, but it would be pretty expensive to unzip and sha256 every single layer in order to compute diffids 😄

I believe my comment was around the time when things were just starting to be whitelisted, so none of the repositories I looked at on quay had any schema 2 images in them at the time. I see that's changed now 🎉

after push support for Schema 1 has been deprecated and removed from Docker and other clients (which seems to be sometime this month for Docker).

Do you have a reference to that deprecation? My cursory googling didn't turn up anything.

The expectation, as per our read of the Schema 1 -> Schema 2 migration, has always been that clients will remain supporting both versions for a significant period of time

Yeah, that's reasonable for existing clients, but it's hard to justify for new tools/clients IMO. Ideally, you could take the OCI specs and (for the most part) write a working client, but schema 1 carries a lot of historical baggage and additional complexity (especially w.r.t. signatures) that I'd love to drop. I'm still on the fence about supporting schema 1 on the pull path for this library (for older images), but I think we're almost at the point where it's not worth the effort, especially with quay now supporting schema 2 😄

Thanks!

@jonjohnsonjr
Copy link
Collaborator Author

Anyone coming here from the ErrSchema error message:

You might be interested in #1626, which implements enough of v1.Image to use with, e.g. mutate.Extract to read the filesystem of a schema 1 image.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants