Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crane: Should be able to specify multiple platforms to 'crane pull' #1399

Closed
RedHawker opened this issue Jun 28, 2022 · 9 comments
Closed

crane: Should be able to specify multiple platforms to 'crane pull' #1399

RedHawker opened this issue Jun 28, 2022 · 9 comments
Labels
bug Something isn't working

Comments

@RedHawker
Copy link

Describe the bug

Running crane pull IMAGE DIR --format=oci --platform=linux/arm64 --platform=linux/amd64 --platform=linux/arm will only pull the linux/arm image.

Whereas running with --platform=all results in pulling more platforms than necessary for our system. (We only have AMD64, ARM64, and ARM hardware at this time)

To Reproduce

Run crane pull weaveworks/weave-kube:2.8.1 temp_dir --platform=linux/arm64 --platform=linux/amd64 --platform=linux/arm --format=oci
Run grep -r amd64 temp_dir and it won't return anything.
Run grep -r arm temp_dir gives a result.

Expected behavior

Running crane pull IMAGE DIR --format=oci --platform=linux/arm64 --platform=linux/amd64 --platform=linux/arm results in pulling the 3 different architectures, not just the last one.

Additional context

Add any other context about the problem here.

  • Output of crane version: 0.10.0
  • Registry: Private (local) Registry 2 or 2.8.1
@RedHawker RedHawker added the bug Something isn't working label Jun 28, 2022
@imjasonh
Copy link
Collaborator

I think we have a few options here:

  1. Fail if multiple --platforms are passed. This doesn't address your issue, but it's at least better than the confusing behavior of silently only pulling the last specified platform.
  2. Accept multiple --platforms, and write an index manifest that only includes the specified platforms (fail if a specified platform isn't present in the index). This has the unfortunate side effect of changing the contents and digest of the index manifest, which might be confusing later.
  3. Accept multiple --platforms, and write the original unchanged index manifest, but only pull matching images (fail if a specified platform isn't present in the index). This results in a "sparse index", where the index retains pointers to images which may not be present in the OCI layout. The downside is that some registries may not accept this. 😕

Among the three I'd say (3) is probably my preference. Registry compatibility is a concern, but I believe language was recently added to the specs to allow this, and registries should update to allow it. If your registry doesn't support it (I don't know if registry:2 does), --platform=all will still work, with some added storage/latency.

@RedHawker
Copy link
Author

For 2) what do you mean when you say, "which might be confusing later"?

Maybe I'm trying to use crane for a purpose that it wasn't intended for? (We're on a resource-constrained system with limited storage space available to the registry, so pruning down what we're storing only to the necessary platforms is necessary.)

We currently pull each image for each platform using docker pull then use the docker manifest command set to build up a multi-arch manifest, I was hoping we could utilize crane to simplify that workflow.

@imjasonh
Copy link
Collaborator

The potentially confusing this is that if you crane pull multi-arch-image@sha256:abc... --platform=a,b, and as a result it updates the manifest to only include platforms a,b, the digest of that manifest won't be the digest of the manifest you pulled. If there are signatures/SBOMs/etc attached to sha256:abc, those won't be attached to your modified manifest, even though the content is presumably still valid.

This behavior should also extend beyond crane pull, to crane cp and likely crane push too (e.g., to pull all manifests, then only push some), and crane validate, maybe others, so whatever approach we take should also make sense in those commands.

Given that registries are supposed to handle sparse indexes, and that would solve your storage constraint problems, I think that's the way forward. We'll probably need to do some surgery inside the guts of pkg/v1/remote or pkg/v1 to handle this filtering.

cc @jonjohnsonjr

@jonjohnsonjr
Copy link
Collaborator

  1. Accept multiple --platforms, and write an index manifest that only includes the specified platforms

You can kind of achieve this today:

$ crane pull --format=oci --platform=linux/amd64 ubuntu layout/
$ crane pull --format=oci --platform=linux/arm64 ubuntu layout/
$ crane push --index layout/ gcr.io/jonjohnson-test/ubuntu:pruned
$ crane manifest gcr.io/jonjohnson-test/ubuntu:pruned | jq
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 529,
      "digest": "sha256:bace9fb0d5923a675c894d5c815da75ffe35e24970166a48a4460a48ae6e0d19",
      "annotations": {
        "dev.ggcr.image.name": "ubuntu"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 529,
      "digest": "sha256:0f744430d9643a0ec647a4addcac14b1fbb11424be434165c15e2cc7269f70f8",
      "annotations": {
        "dev.ggcr.image.name": "ubuntu"
      }
    }
  ]
}

The problem is that we're dropping platform (and potentially other fields from the descriptor) when appending to index.json. We might want to preserve those, but I'm not sure if we always want to preserve those?

I wonder if we should have some flags for crane pull --format=oci to control this behavior... I don't love it but what we're doing now seems suboptimal.

@jonjohnsonjr
Copy link
Collaborator

Put together #1400 but we might want to make our "dev.ggcr.image.name": "ubuntu" annotation optional...

@RedHawker
Copy link
Author

If I could do that above process, and have it work, that would work for my purposes.

@jonjohnsonjr
Copy link
Collaborator

With #1400 merged things should work but you might want to wait for #1401 to resolve if you care about the weird annotation we're currently adding. Alternatively it's straightforward to use jq or something to drop it by directly manipulating index.json.

@jonjohnsonjr
Copy link
Collaborator

Alright and now we should be good to go, closing this because I think you're happy with that but re-open if not!

@RedHawker
Copy link
Author

Just tested out the change, it works great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants