Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix oci to save the full image name in index.json #318

Merged
merged 1 commit into from
Aug 31, 2017

Conversation

umohnani8
Copy link
Member

Fix oci to save the full image name image:tag instead of just tag in index.json.
Add function to retrieve the image name from index.json when loading or pulling image from oci directory.

Signed-off-by: umohnani8 umohnani@redhat.com

@runcom
Copy link
Member

runcom commented Jul 31, 2017

I don't like this, we should have a way to have users specify additional annotations. That annotation is specified in the OCI image-spec here https://github.com/opencontainers/image-spec/blob/710038243d857231f17df1c3f4c10850154bd1f7/image-layout.md#indexjson-file (and later paragraph).
That stuff in the spec isn't a must of course, but I want to leave it that way (it's just a tag)

Implementor's Note: A common use case of descriptors with a "org.opencontainers.image.ref.name" annotation is representing a "tag" for a container image. For example, an image may have a tag for different versions or builds of the software. In the wild you often see "tags" like "v1.0.0-vendor.0", "2.0.0-debug", etc. Those tags will often be represented in an image-layout repository with matching "org.opencontainers.image.ref.name" annotations like "v1.0.0-vendor.0", "2.0.0-debug", etc.

Again, we should have a way to save additional annotations instead and save the full name iif needed.

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 31, 2017

@runcom OTOH see https://github.com/opencontainers/image-spec/blob/master/annotations.md explicitly documenting a specific grammar.

As for additional annotations, dunno; there’s an inherent tension between c/image being general and treating all transports the same, and exposing everything there is to each transport. It’s not at all obvious to me how to generalize OCI annotations to anything else.

@umohnani8 umohnani8 force-pushed the oci_name branch 2 times, most recently from 5113603 to 4576bac Compare July 31, 2017 19:21
@umohnani8
Copy link
Member Author

@runcom we need the full image name when doing kpod load or kpod pull so that it is saved with the actual image name in containers-storage. Which Annotation would you suggest I use to store the full image name because "org.opencontainers.image.ref.name" seems the best one as the full image name is used to get the reference of the image.

@umohnani8 umohnani8 force-pushed the oci_name branch 2 times, most recently from d1581d4 to 45900f1 Compare August 1, 2017 15:01
@runcom
Copy link
Member

runcom commented Aug 1, 2017

LGTM since nothing in the spec prevents this https://github.com/opencontainers/image-spec/blob/master/annotations.md#pre-defined-annotation-keys

also, as @mtrmac pointed out, abstracting OCI annotation is something not-trivial right now so let's defer that.

Approved with PullApprove

@runcom
Copy link
Member

runcom commented Aug 1, 2017

@umohnani8 you need to fix tests failures though

@umohnani8
Copy link
Member Author

@runcom ok thanks, working on the test failures

@umohnani8
Copy link
Member Author

@mtrmac PTAL

@umohnani8
Copy link
Member Author

@mtrmac PTAL

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation mostly looks good, but the ”unspecified image name” case has non-trivial implications on the EDITnamingsemantics.

image = sep[1]
} else {
image = sep[1] + ":latest"
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means

  1. $file:busybox == $file:busybox:latest
  2. $file == $file: (empty image name)

Is this what we want? Before this PR 1. would write busybox, not busybox:latest, in the annotation, and 2 would use latest. I somewhat lean towards keeping 1 at least; for 2, see elsewhere here…

@runcom ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we must keep 1 as well

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean here, should it not add the latest tag when a tag is not given for the image?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, you’re right this is unclear. Yes, that’s what I mean by 1: treat the image name as a string which as no further structure, unlike docker/reference where, in the extreme, busybox is a shorthand for docker.io/library/busybox:latest.

We’ve been treating the ref.name annotation as a Docker-like “tag”, and those default to latest; but now there’s the actual spec-by-regexp, which says nothing about existence or default values of tags.

(Not adding the :latest also keeps the behavior more, though not completely, compatible with the behavior before this PR.)

[Or is there pre-existing behavior, perhaps in umoci, we should co-opt here? I haven’t checked.]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that the OCI transport should be able to read and write images with any ref-name allowed by the specification from an image-layout. Doesn't appending :latest here break that?

If compatibility with what the oci transport used to do is desired, the old behavior where oci: was treated as oci::latest entirely compatible with the intended use case of this patch. In the end second component of the scope isn't a tag or an image name, it's simply the value of the org.opencontainers.image.ref.name annotation - if you don't want to get 'latest' stuck in there, just pass in your own value.

But if you want to be forward looking, and consider trying to handle image layouts with no ref-names at all (you could havedifferent images distinguished by platform like a Docker manifestlist) then I'd think it would be better to break compat and not introduce 'latest' at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mtrmac I get it now. The reason I was adding latest to the image name if a tag was not given was because of the way kpod and docker do things. For example, if a user downloads the alpine:latest image from a registry and decides to save to oci on disk using the command kpod save --oci -o alp.tar alpine, the image name that will be saved in index.json based on your comment will be alpine.

Therefore, when alp.tar is loaded back into containers-storage using the kpod load command, it will be stored as alpine instead of alpine:latest. I feel like this alters some form of the identity of the image as there would be no tag associated with it.

If we want to give the user the freedom of setting the image name to whatever they want it to be, we need to make it very clear in the documentation since all the kpod commands allow the user to access images with latest tag by just passing in the name.

Copy link
Contributor

@owtaylor owtaylor Aug 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@umohnani8 what kpod does is, of course, separate from what the containers/image code does. Despite the name of this pull request, I don't think containers/image should be assuming that the ref-name in the OCI is "a full image name" - it's just a string following certain rules. The behavior that the string is the "full image name" is being added by kpod, so if you want the behavior that kpod save --oci -o alp.tar alpine stores a ref-name of alpine::latest then kpod would be the logical place to do that. To me, the command seems to be saying "look up the name alpine in storage and save that to alp.tar" and I'd expect the :latest to be added during the lookup process, not during the saving.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@owtaylor agreed, just realised that this was already being handled elsewhere - my bad. Will take out the image name modification from OCI.

{"", "latest"},
for _, image := range []struct{ suffix, image string }{
{":notlatest:image", "notlatest:image"},
{":latestimage", "latestimage:latest"},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should still be a test for the “no :$image” case, and the case of a trailing : (i.e. an explicitly empty $image).

{"/dir1:notlatest", "/dir1:notlatest"}, // Explicit tag
{"/dir2", "/dir2:latest"}, // Default tag
{"/dir1:notlatest:notlatest", "/dir1:notlatest:notlatest"}, // Explicit image
{"/dir2:image", "/dir2:image:latest"}, // Default image
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also should have a test for the “no :$image” case.

@@ -85,41 +84,44 @@ type ociReference struct {
// (But in general, we make no attempt to be completely safe against concurrent hostile filesystem modifications.)
dir string // As specified by the user. May be relative, contain symlinks, etc.
resolvedDir string // Absolute path with no symlinks, at least at the time of its creation. Primarily used for policy namespaces.
tag string
image string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If image == "" means ”the only image in the index” (which does seem to be necessary, but it would be nice to document this in the PR description, BTW), that should be documented here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented it.

@@ -178,7 +178,7 @@ func (d *ociImageDestination) PutManifest(m []byte) error {
}

annotations := make(map[string]string)
annotations["org.opencontainers.image.ref.name"] = d.ref.tag
annotations["org.opencontainers.image.ref.name"] = d.ref.image
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this happen even if d.ref.image == ""? https://github.com/opencontainers/image-spec/blob/master/annotations.md#pre-defined-annotation-keys seems to forbid empty names.

Should we just skip adding the annotation in that case? Or perhaps refuse to write an image without an user-specified name entirely?

@runcom WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say error out if name is empty, spec forbids it also

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spec forbids it also

Where? I can’t see it; in fact the index specification says that all of annotations is optional.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the EBNF:

ref       ::= component ("/" component)*
component ::= alphanum (separator alphanum)*
alphanum  ::= [A-Za-z0-9]+
separator ::= [-._:@+] | "--"

You need to have at least one component which needs to have at least one alphanum.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That says that the annotation, if it is present, must not be empty; but the annotation can AFAICS legally be completely omitted. The complete dictionary can be missing:

annotations string-string map

This OPTIONAL property contains arbitrary metadata for the image index.

and even if the annotations dictionary exists, I can’t see anything requiring it to contain the org.opencontainers.image.ref.name key. Only if the dictionary and the ref.name key exists, does the EBNF require a non-empty name.

// LoadManifestDescriptor loads the manifest descriptor to be used to retrieve the image name
// when pulling an image
func LoadManifestDescriptor(imgRef types.ImageReference) (imgspecv1.Descriptor, error) {
return imgRef.(ociReference).getManifestDescriptor()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this exists only to get the name from one-item indexes? Or is this intended to be similar to docker.tarfile.Source.LoadTarManifest, which returns all of the images?

The forced .(ociReference) cast here is a bit ugly… at least it should be a checked cast, returning an error when the caller uses a different kind of ImageReference; but perhaps the functionality could instead be sliced differently, with the underlying implementation getting a (path string, image string), and getManifestDescriptor() extracting the two strings from an ociReference, and then (if the goal is really to only read one-image indexes) a LoadManifestDescriptor(path string) just setting image = "".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking the type to ensure its of ociReference before casting it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to use reflection here:

ociRef, ok := imgRef.(ociReference)
if !ok { fail }
return ociRef.getManifestDescriptor()

@@ -38,9 +38,9 @@ func TestTransportValidatePolicyConfigurationScope(t *testing.T) {
"/has/./dot",
"/has/dot/../dot",
"/trailing/slash/",
"/etc:invalid'tag!value@",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Note: first the semantics need to be decided, the comments on the tests may become irrelevant if the semantics were to be changed. And yes, I am rather paranoid about test coverage of the naming — it is relevant to signature verification.)

The list of valid cases above this one should include a name with multiple colons, and a name with some slashes.

assert.Error(t, err)

_, err = NewReference(tmpDir, "invalid'tag!value@")
_, err = NewReference(tmpDir, "invalid'image!value@")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also should have a test for the image == "" case.

"/path:with/colons",
"/path:with/colons/and:tag",
"/path:with/colons/and:image",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also should have a test for a trailing colon (i.e. an explicitly empty $image).

@@ -36,22 +36,21 @@ func (t ociTransport) ParseReference(reference string) (types.ImageReference, er
return ParseReference(reference)
}

var refRegexp = regexp.MustCompile(`^([A-Za-z0-9._-]+)+$`)
var refRegexp = regexp.MustCompile(`^([A-Za-z0-9:._-]+)+$`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please, let's follow it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed to use the spec.

@umohnani8
Copy link
Member Author

@mtrmac @runcom I made some changes based on your feedback, PTAL. Working on the tests and waiting for some clarification on the other feedback.

if !refRegexp.MatchString(tag) {
return errors.Errorf("Invalid tag %s", tag)
sep := strings.SplitN(scope, ":", 2)
dir = sep[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that with the change to Split(N), the code that follows

   if strings.Contains(dir, ":") {
           return errors.Errorf("Invalid OCI reference %s: path contains a colon", scope)
   }

Will never be hit - 'dir' cannot contain a ':' - a policy configuration scope of '/foo:bar:baz' is treated as '/foo' with a tag of 'bar:baz'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, fixed.

@owtaylor
Copy link
Contributor

See also the comment about paths with ':' I left at #323 (comment) in response to a comment by @cyphar - it's as relevant to this PR as that one.

@umohnani8
Copy link
Member Author

Fixing skopeo tests containers/skopeo#399

@umohnani8
Copy link
Member Author

@mtrmac PTAL

@mtrmac
Copy link
Collaborator

mtrmac commented Aug 28, 2017

See also the comment about paths with ':' I left at #323 (comment) in response to a comment by @cyphar - it's as relevant to this PR as that one.

Yeah, could more people take a look at that? (Notably, : is a required character in C:\Users\Guest\-like paths on Windows.)

Arguably we’ve been using :, and forbidding : in paths, before this PR, and this PR does not really change anything about that, so it should not be blocked on that decision… but eventually we need to decide whether to handle that, and if so, how.

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great, including the test coverage, just a couple of last small details.

I don’t think the : conversation is blocking this; though I’d love to hear from @cyphar more details about #326 (comment) :

I think #318 is not doing the right thing.

@@ -177,8 +177,12 @@ func (d *ociImageDestination) PutManifest(m []byte) error {
return err
}

if d.ref.image == "" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do end up refusing the unnamed images (see the other conversation), it would be nicer to check this already in newDestination, to report this to the user immediately instead of first uploading gigabytes of blob data.

(I don’t think this PR is blocked on the conversation about empty image names; we can merge this PR refusing the empty image names now, and perhaps relax that later.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying that this check should be in either NewImageDestination or in newImageDestination?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; by that time we already know that the upload will fail, so we can skip copying blobs if we report this to the user early.

(Both ref.NewImageDestination and newImageDestination work; newImageDestination is a bit better because it is more difficult to unintentionally bypass.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put the check in newImageDestinatioin, fixed.

// from being ambiguous with values of PolicyConfigurationIdentity.
if strings.Contains(resolved, ":") {
return nil, errors.Errorf("Invalid OCI reference %s:%s: path %s contains a colon", dir, tag, resolved)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICS this check still needs to stay. (Not for the ParseReference caller, where this can’t happen, but for external callers of NewReference.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the check back.

@@ -150,7 +146,7 @@ func (ref ociReference) DockerReference() reference.Named {
// not required/guaranteed that it will be a valid input to Transport().ParseReference().
// Returns "" if configuration identities for these references are not supported.
func (ref ociReference) PolicyConfigurationIdentity() string {
return fmt.Sprintf("%s:%s", ref.resolvedDir, ref.tag)
return fmt.Sprintf("%s", ref.resolvedDir)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have a comment, perhaps something like

// NOTE: ref.image is not a part of the image identity, because "$dir:$someimage" and "$dir:" may mean the
// same image and the two can’t be statically disambiguated.  Using at least the repository directory is
// less granular but hopefully still useful.

so ensure that someone (e.g. me 😄 ) doesn’t re-add the ref.index in the future without understanding why we had to take it out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the comment.

for _, image := range []struct{ suffix, image string }{
{":notlatest:image", "notlatest:image"},
{":latestimage", "latestimage"},
{":", ""},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a {"", ""}, case here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added this test case


_, err = NewReference(tmpDir+"/has:colon", tagValue)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(The has:colon-in-path test should be reinstated after NewReference again starts refusing such paths.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added test back

{"/dir1:notlatest", "/dir1:notlatest"}, // Explicit tag
{"/dir2", "/dir2:latest"}, // Default tag
{"/dir1:notlatest:notlatest", "/dir1:notlatest:notlatest"}, // Explicit image
{"/dir2:image", "/dir2:image"}, // Default image
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is actually nothing “default” about this :image now that :latest is not treated specially; this line can be deleted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed default line

}
return *d, nil
}

// LoadManifestDescriptor loads the manifest descriptor to be used to retrieve the image name
// when pulling an image
func LoadManifestDescriptor(imgRef types.ImageReference) (imgspecv1.Descriptor, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have tests for this (both with a known image name and for the "" case), but it is not relevant for signing, so absolutely non-blocking for me.

@umohnani8 umohnani8 force-pushed the oci_name branch 2 times, most recently from c8de29a to ad3c29d Compare August 30, 2017 15:07
Fix oci to save the full image name "image:tag" instead of just the tag in index.json.
Add function to retrieve the image name from index.json when loading or pulling image from oci directory.

Signed-off-by: umohnani8 <umohnani@redhat.com>
@umohnani8
Copy link
Member Author

@mtrmac PTAL

@mtrmac
Copy link
Collaborator

mtrmac commented Aug 31, 2017

👍

Approved with PullApprove

@mtrmac mtrmac merged commit 5216ee0 into containers:master Aug 31, 2017
@mtrmac
Copy link
Collaborator

mtrmac commented Aug 31, 2017

@umohnani8 Please update #399 to use the master branch again, per https://github.com/projectatomic/skopeo/blob/master/README.md#dependencies-management .

@umohnani8
Copy link
Member Author

@mtrmac done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants