Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add FindManifests #828

Merged
merged 1 commit into from
Nov 20, 2020
Merged

add FindManifests #828

merged 1 commit into from
Nov 20, 2020

Conversation

deitch
Copy link
Collaborator

@deitch deitch commented Nov 16, 2020

Add FindManifests utility, that allows one to pass a v1.ImageIndex and a match.Matcher, and get back the v1.Descriptor entries in the index that match.

This is a spin-out from #823 and will include the salient original comments below.

We expect to have several more such utilities that build on Matcher in this PR and/or others before it is done.

@deitch
Copy link
Collaborator Author

deitch commented Nov 16, 2020

Note: this is currently in pkg/v1/v1util, and probably should move elsewhere.

Comments from the original discussion:

@imjasonh wrote:

Re: naming, I don't really love v1/partial or v1/v1util. If we expect there to be a number of provided filters, and we expect them to be used for walking a graph of manifests, perhaps these should go in their own package, e.g., v1/filter or v1/walk?

@jonjohnsonjr wrote:

I'm a little torn here because this is useful both as a helper for packages implementing an ImageIndex (argument for partial) and as a new API for library consumers interacting with an ImageIndex (argument for new package?).
Given your precedent in #797, I'd lean partial, but I hate that package and wish we could get rid of it, honestly.

@deitch wrote (about partial and v1util):

You don't like it in partial, @imjasonh doesn't like it in partial, I don't like it in partial... why put it there?
So let's stay away from it.

Could just go in a utilities package, but isn't that what v1util is?

@jonjohnsonjr wrote:

Put FindManifests in partial (for now).
There's precedent for similar functions.
We will probably want to refactor a bunch of other packages to use it.
I'm not completely convinced this is the final API, and partial being a grab bag of stuff like this makes me feel like it's okay for it to live here for a bit, since it's immediately and obviously useful.

@deitch
Copy link
Collaborator Author

deitch commented Nov 16, 2020

What parts should be in here?

@jonjohnsonjr wrote:

I imagine we'll start with something like these

  one many recursive
Image FindLayer FindLayers WalkLayers
Index FindManifest FindManifests WalkManifests

And end up with something like this, not sure where:

  one many recursive
generic Find Filter Walk

We may want to just skip to the generic bits and do some type-checking (like in remote.MultiWrite) to parse the descriptors appropriately.

Maybe we should rename FindManifests to FilterManifests?

@deitch
Copy link
Collaborator Author

deitch commented Nov 16, 2020

See more comments here and here

@deitch
Copy link
Collaborator Author

deitch commented Nov 16, 2020

So @jonjohnsonjr where does this leave us? What do we want in this PR? Let's ignore the "where" (which package) for now, until the rest of this is ready. Then we can move it all over in one commit change. The API is what we need to work out.

  • Do we want the FindManifests() that takes an ImageIndex as is?
  • Do we want FindLayers() that takes an Image in parallel?
  • Should it be renamed to FindIndexManifests() and FindImageLayers() (or IndexFindManifests() and ImageFindLayers(), or is the current naming sufficient?
  • Do we want the singular FindManifest() and FindLayer()?

I think I view the recursive functionality as potentially another layer; we can and should do it, but it can be a separate PR (since it requires Walk()). Of course, the single-layer (non-recursive) is just a special use case (Walk with depth=1), but it doesn't change the API, so can be adapted afterwards, if need be.

@jonjohnsonjr
Copy link
Collaborator

Let's ignore the "where" (which package) for now, until the rest of this is ready. Then we can move it all over in one commit change. The API is what we need to work out.

I think "where" and the API are somewhat linked. If you are okay with letting this languish for a bit while we experiment and discuss it to death, then I agree we should figure out the API first and then decide on where it should go, but if you're looking to get this merged soon-ish, it might make sense to just put it in partial and punt on thinking too hard about it until later.

I have a really hard time judging an API before I actually try to use it. Assuming I could find the time, I'd want to attempt to refactor various parts of go-containerregistry to use the match package, then attempt to pull out the common abstraction. Until I've done that, I'm going to be a pretty poor reviewer.

I'll throw out some stuff that has jumped to my mind while chewing on this idea, though:

implementation details vs external consumers

One reason I keep coming back to partial is that, as is, this really would be useful for many of our pkg/v1 implementations. We often iterate over manifests/layers to match some subset, so I'd want to try to refactor those to use any proposed API before we merge it. One place this would seem especially useful is remote.MultiWrite cc @imjasonh

On the other hand, we're also exposing an API for consumers of this library (e.g. you). As I described in the README (and mutate README, I more or less expect consumers to stick to methods on the v1 interfaces, the mutate package, and (rarely) the partial package. Sticking this stuff in partial demotes it a little bit, but I'm not 100% convinced that this interface has reached its "final" form, which leads me to my next thought...

bridging the descriptor graph

So the current API (returning v1.Descriptors) is really useful for code dealing with descriptors (often our own packages), but I don't expect consumers to be dealing with descriptors that often.

Let's consider the layout package. I have an index.json that contains two descriptors, one for ubuntu and one for debian, both of which are multi-platform indexes pointing to amd64 and s390x images:

image

As a motivating example, let's say I want to compare all the s390x images in this layout. How do we do this? Ignoring errors...

func findManifests(root v1.ImageIndex, p v1.Platform) []v1.Image {
  matches := []v1.Image{}

  m, _ := root.IndexManifest()
  for _, desc := range m.Manifests {
    idx, _ := root.ImageIndex(desc.Digest)
    for _, desc := range partial.FindManifests(idx, match.Platform(p)) {
      img, _ := idx.Image(desc.Digest)
      matches = append(matches, img)
    }
  }

  return matches
}

Since I want access to the image layers, config, etc., I need to convert the v1.Descriptors I get back from FindManifests into v1.Images. To do that, I need access to their parent v1.ImageIndex so I can call Image(Hash) on that. This isn't too bad, given that there's really just one level of nesting that I care about, but it would be really nice if you could instead do FindImages:

func findImages(root v1.ImageIndex, p v1.Platform) []v1.Image {
  matches := []v1.Image{}

  m, _ := root.IndexManifest()
  for _, desc := range m.Manifests {
    idx, _ := root.ImageIndex(desc.Digest)
    imgs := partial.FindImages(idx, match.Platform(p))
    matches = append(matches, imgs...)
  }

  return matches
}

Alternatively, we could do something like remote.Descriptor where you get back something that can be "cast" into a v1.Image or v1.ImageIndex but contains all the v1.Descriptor fields:

func findImages(root v1.ImageIndex, p v1.Platform) []v1.Image {
  matches := []v1.Image{}

  m, _ := root.IndexManifest()
  for _, desc := range m.Manifests {
    idx, _ := root.ImageIndex(desc.Digest)
    for _, desc := range partial.FindManifests(idx, match.Platform(p)) {
      matches = append(matches, desc.Image())
    }
  }

  return matches
}

That improves on things a lot. If we do something like this, implementing a recursive version makes most of this code go away:

func walkImages(root v1.ImageIndex, p v1.Platform) []v1.Image {
  matches := []v1.Image{}
  for _, desc := range partial.WalkManifests(root, match.Platform(p)) {
      matches = append(matches, desc.Image())
  }
  return matches
}

I think the crux of the issue is that our v1 interfaces are of multiple types: v1.Image, v1.ImageIndex, and v1.Layer; but we are dealing with just a single pointer type of v1.Descriptor, and we don't have a good way to go from v1.Descriptor back to the interface universe unless you re-traverse through everything that FindWhatever or WalkWhatever have already traversed.

strawman

A rough stab at something that would make sense as a result of FindFoo:

// TODO: Name it, embed into remote.Descriptor, use anywhere there's ambiguity.
type v1.Either interface {
  Image() (v1.Image, error)
  ImageIndex() (v1.ImageIndex, error)
}

// TODO: Name it something better.
type Needle struct {
  // embed v1.Descriptor like remote.Descriptor does
  v1.Descriptor

  // Allow callers to re-interpret as an Image or ImageIndex.
  v1.Either

  
  // Optional section that we could add later.

  // This seems like it would very often be very useful (root == nil).
  Parent() *Needle

  // Perhaps add ways to access anything that a descriptor can point to?
  Layer() (v1.Layer, error)
  Blob() (io.ReadCloser, error)
} 

This isn't an "obviously correct" abstraction to me, so I really hesitate to committing to this, but it does satisfy most of the implementation considerations that we've discussed. Please be very critical here, as I'm sure it can be improved.

Do we want the FindManifests() that takes an ImageIndex as is?
Should it be renamed to FindIndexManifests() and FindImageLayers() (or IndexFindManifests() and ImageFindLayers(), or is the current naming sufficient?

I think it is fine as is if we are willing to let it live on the Island of Misfit Toys.

Do we want FindLayers() that takes an Image in parallel?
Do we want the singular FindManifest() and FindLayer()?

Let's hold off until you or I actually need it.

recursive ... it can be a separate PR

Definitely, but we should attempt to figure out the interface for that in this PR if we want a consistent experience.

I feel like this is two separate conversations kind of intertwined: there's "what is our ideal API for this kind of thing" vs "what is reasonable/useful to merge now".

@deitch
Copy link
Collaborator Author

deitch commented Nov 17, 2020

I think "where" and the API are somewhat linked. If you are okay with letting this languish for a bit while we experiment and discuss it to death, then I agree we should figure out the API first and then decide on where it should go, but if you're looking to get this merged soon-ish, it might make sense to just put it in partial and punt on thinking too hard about it until later.

OK, that is a pretty good (practical) argument. I will move whatever we do here into partial with the next commit.

@deitch
Copy link
Collaborator Author

deitch commented Nov 17, 2020

I feel like this is two separate conversations kind of intertwined: there's "what is our ideal API for this kind of thing" vs "what is reasonable/useful to merge now".

Heh, so do I. It probably actually is 3 or more, but I can barely keep the two in my head. I know of left-brain and right-brain, but forward-brain, mid-brain, back-brain, up-brain, down-brain, strange-brain, charmed-brain, and Higgs-brain are too much for me.

but it would be really nice if you could instead do FindImages

Agreed totally. I actually saw FindManifests() as a lower-level primitive, that, in turn, gets wrapped by FindImages():

// FindManifests given a v1.ImageIndex, find the manifests that fit the matcher
func FindManifests(index v1.ImageIndex, matcher match.Matcher) ([]v1.Descriptor, error) {
        // get the actual manifest list
        indexManifest, err := index.IndexManifest()
        if err != nil {
                return nil, fmt.Errorf("unable to get raw index: %v", err)
        }
        manifests := []v1.Descriptor{}
        // try to get the root of our image
        for _, manifest := range indexManifest.Manifests {
                if matcher(manifest) {
                        manifests = append(manifests, manifest)
                }
        }
        return manifests, nil
}

// FindImages given a v1.ImageIndex, find the images that fit the matcher
func FindImages(index v1.ImageIndex, matcher match.Matcher) ([]v1.Image, error) {
       matches := []v1.Image{}
        manifests, err := FindManifests(index, matcher)
        if err != nil {
            return nil, err
        }
        for _, desc := range manifests {
            img, _ := index.Image(desc.Digest)
            matches = append(matches, img)
        }
       return matches, nil
}

Actually, I will add the above to this PR-in-progress, just to see how it looks/feels.

This brings us to your "wrapper", which has the signature func findManifests(root v1.ImageIndex, p v1.Platform) []v1.Image, i.e. find-by-platform, and not find-by-any-matcher. I like those "common-use-case" simplifications as well, but what is the signature? golang doesn't have overloading, so we cannot have FindManifests(ImageIndex, Platform) and FindManifests(ImageIndex, Matcher).

Alternatively, we could do something like remote.Descriptor where you get back something that can be "cast" into a v1.Image or v1.ImageIndex but contains all the v1.Descriptor fields

I like the idea that there is a uniform interface between layout and remote. Whether it is an actual interface or just how it behaves, I like the similar functioning.

This does become the "second conversation" you raised above.

That improves on things a lot. If we do something like this, implementing a recursive version makes most of this code go away:

Not really. We cannot use just partial.WalkManifests(root, match.Platform(p)), since the matcher might be different at different layers. E.g. I might have an index that I select manifest by name, which leads to an index that I select by some other annotation, which leads to an index that I select by architecture.

I think my point is that while a generic walker can make sense, a generic matcher across those doesn't appear to make sense, because the very reason you have multiple layers of indexes (really, indices) is because each one serves a different aggregation upwards (and therefore filtering downwards) purpose. Without the context at each layer, selecting becomes challenging.

@deitch
Copy link
Collaborator Author

deitch commented Nov 17, 2020

Added func FindImages(index v1.ImageIndex, matcher match.Matcher) ([]v1.Image, error) to this PR

@deitch
Copy link
Collaborator Author

deitch commented Nov 17, 2020

I think the crux of the issue is that our v1 interfaces are of multiple types: v1.Image, v1.ImageIndex, and v1.Layer; but we are dealing with just a single pointer type of v1.Descriptor, and we don't have a good way to go from v1.Descriptor back to the interface universe unless you re-traverse through everything that FindWhatever or WalkWhatever have already traversed.

Do we want to open a new PR in parallel to deal with that? One that has this interface v1.Either (or a better name), retrofits it into remote.Descriptor (which doesn't need any changes other than some comments), and adds it to layout? Keep the conversations separate, as they eventually merge on the return values of Find*()? We can get this interface mostly right, that one right, then get that one in and update this one quickly?

@codecov-io
Copy link

codecov-io commented Nov 17, 2020

Codecov Report

Merging #828 (5224d45) into master (3904ad8) will decrease coverage by 0.06%.
The diff coverage is 67.56%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #828      +/-   ##
==========================================
- Coverage   74.83%   74.77%   -0.07%     
==========================================
  Files         105      106       +1     
  Lines        4383     4420      +37     
==========================================
+ Hits         3280     3305      +25     
- Misses        620      626       +6     
- Partials      483      489       +6     
Impacted Files Coverage Δ
pkg/v1/partial/index.go 58.62% <58.62%> (ø)
pkg/v1/types/types.go 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3904ad8...5224d45. Read the comment docs.

@jonjohnsonjr
Copy link
Collaborator

I know of left-brain and right-brain, but forward-brain, mid-brain, back-brain, up-brain, down-brain, strange-brain, charmed-brain, and Higgs-brain are too much for me.

Just need to invent time travel so I can experiment with the consequences of merging this PR before I merge it.

Actually, I will add the above to this PR-in-progress, just to see how it looks/feels.

Now we're cooking with fire.

I like the idea that there is a uniform interface between layout and remote. Whether it is an actual interface or just how it behaves, I like the similar functioning.

Yeah this might be the practical thing to do, but let me think about how these things might compose or be interchangeable before we settle on a signature.

E.g. I might have an index that I select manifest by name, which leads to an index that I select by some other annotation, which leads to an index that I select by architecture.

I'd actually be interested to see if we can express this requirement as a matcher (or something similar) in the type system. From a readability perspective, loops and stuff might be better, but I'm imagining some API where I can pass a single matcher and it's more efficient to let the implementation apply the matcher than to do multiple passes (this feels unlikely to me given how we've carefully made most of these packages do lazy access to everything, but it's something to consider).

I think my point is that while a generic walker can make sense, a generic matcher across those doesn't appear to make sense, because the very reason you have multiple layers of indexes (really, indices) is because each one serves a different aggregation upwards (and therefore filtering downwards) purpose. Without the context at each layer, selecting becomes challenging.

This is a good point, and maybe where my Needle would be useful for walking so you can look upwards to get context. I wish we could write a matcher that worked against both Needle and v1.Descriptor... alas, golang will not allow it, and it's probably not worth the complexity. I really like the simplicity of func(v1.Descriptor) bool.

the very reason you have multiple layers of indexes (really, indices) is because each one serves a different aggregation upwards (and therefore filtering downwards) purpose

I don't know if this is necessarily true. You might have a "bag" of images that some downstream consumer client will know how to interpret (by doing platform resolution), but there isn't a strict structure. E.g. maybe some of your top-level index.json descriptors point to images and some indexes. I do agree though, and something like this shouldn't be exposed to users because it's somewhat of a footgun.

(really, indices)

Yeah, I am really torn on this and have started writing "indexes" in this context for a couple reasons:

  1. An Image Index here feels a lot like a proper noun and less like an actual index, which I want to convey by keeping that string intact (e.g. if you want to ctrl+f "index") and visually identifiable, but in conversation I'll usually say "indices".
  2. I think that "indices" is more likely to be confusing, especially to non-native English speakers, even if it's more "correct".

Also, enjoy/suffer from the inconsistency even within the spec:
https://github.com/opencontainers/image-spec/search?q=indices
https://github.com/opencontainers/image-spec/search?q=indexes

A younger version of me would be horrified by the concessions of pedantry I've made for pragmatism.

Do we want to open a new PR in parallel to deal with that? One that has this interface v1.Either (or a better name), retrofits it into remote.Descriptor (which doesn't need any changes other than some comments), and adds it to layout? Keep the conversations separate, as they eventually merge on the return values of Find*()? We can get this interface mostly right, that one right, then get that one in and update this one quickly?

Yeah if you want to open a draft PR (assuming I haven't by the time you wake up) to just contain this long-winded conversation, we can then refocus this PR.

return nil, err
}
for _, desc := range manifests {
img, _ := index.Image(desc.Digest)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll want to handle this error (I know this got copied from me 😄).

What should we do if a descriptor matches but points to a non-image artifact?

  1. Let this error out.
  2. Ignore non-Image media types.
  3. Return an error for non-Image media types.

The second option makes me want to implement composition so we can:

FindManifests(index, match.And(match.MediaTypes(types.Images), matcher))

But this might run afoul of the user intent -- they could just supply this matcher themselves! Should we make them? Or is that the whole point of this helper function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am working really hard to avoid composition here. I don't object to it later, but I am focusing on simple; easy for the user to understand use cases, not to mention easy for me to grasp as we work these through.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's think of it practically. I say, "give me all of the images from this index of platform X" or "give me all of the images from here of name Y". I don't really care about the fact that 7 of the 10 manifests do not point to images, I want those 3. The very calling of FindImages implies "it is images I want". I would ignore anything that errors because it is not an image.

As this is work-in-progress, I am going to add that handling in.

By the way, is there a types.Images? I didn't see that. Or were you suggesting it for the future.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am working really hard to avoid composition here. I don't object to it later, but I am focusing on simple; easy for the user to understand use cases, not to mention easy for me to grasp as we work these through.

Yeah that's fine, I don't want to commit to an API either. We could just hardcode some logic in here to check for the right media type (you could use a matcher to simplify) instead of using composition.

I would ignore anything that errors because it is not an image.

I don't love this -- index.Image(hash) could fail for all kinds of reasons. Swallowing them feels dangerous.

By the way, is there a types.Images? I didn't see that. Or were you suggesting it for the future.

Not yet, just theoretical shorthand for a group of roughly equivalent media types.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this -- index.Image(hash) could fail for all kinds of reasons. Swallowing them feels dangerous.

Try this; I just updated to use a custom error and check it with errors.Is()

Copy link
Collaborator Author

@deitch deitch Nov 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also used that custom error in v1/remote and v1/layout, and documentation on v1.ImageIndex interface

@deitch
Copy link
Collaborator Author

deitch commented Nov 17, 2020

Just need to invent time travel so I can experiment with the consequences of merging this PR before I merge it.

If we could invent time travel, about the last thing I would use it for is figuring out PRs. I wouldn't quite, say, stop WWII, as tempting as that would be - the law of unintended consequences is enormous there - but I can see some smaller scale of good, as well as buying a lot of APPL shares a 20 years ago, when they nearly were gone. Do good and do well.

On that note, there was a great original series Star Trek episode about exactly that, "The City on the Edge of Forever".

Yeah this might be the practical thing to do, but let me think about how these things might compose or be interchangeable before we settle on a signature.

Sure. I don't think we need to answer that in this PR. It is one (or maybe two) past it.

This is a good point, and maybe where my Needle would be useful for walking so you can look upwards to get context. I wish we could write a matcher that worked against both Needle and v1.Descriptor... alas, golang will not allow it, and it's probably not worth the complexity. I really like the simplicity of func(v1.Descriptor) bool.

I honestly think it is too complex, but I could be wrong. I am trying to build upwards, and I see that as beyond practical. But, again, happy to be proven wrong. I think that, too, is another layer or so past this. We still are on building blocks with this one.

I don't know if this is necessarily true. You might have a "bag" of images that some downstream consumer client will know how to interpret (by doing platform resolution), but there isn't a strict structure. E.g. maybe some of your top-level index.json descriptors point to images and some indexes. I do agree though, and something like this shouldn't be exposed to users because it's somewhat of a footgun.

It isn't spec-enforced true, but I think it is how it is used. I like the "footgun" analogy, though.

horrified by the concessions of pedantry I've made for pragmatism

Heh, my daughter just walked in and asked if she used the term "panicking" correctly, or if her teacher is correct. Unfortunately, for her, the teacher was. In this case, I am more than willing to be pedantic.

Yeah if you want to open a draft PR (assuming I haven't by the time you wake up) to just contain this long-winded conversation, we can then refocus this PR.

I will try. One or the other of us will do it first and @ cc the other.

@deitch deitch force-pushed the find-index branch 2 times, most recently from 1f41547 to 221d99b Compare November 18, 2020 08:26
@deitch
Copy link
Collaborator Author

deitch commented Nov 18, 2020

So far, this PR does the following:

  • adds the custom NotImageError and NotIndexError errors and uses them
  • adds the following two fund:
func FindManifests(index v1.ImageIndex, matcher match.Matcher) ([]v1.Descriptor, error)
func FindImages(index v1.ImageIndex, matcher match.Matcher) ([]v1.Image, error)

The open questions remain. What do we need to add to this PR to move it ahead?

  • what other functions do we need that parallel those? FindImages() is just a wrapper around FindManifests(), which is a lower-level building block. I can see a FindIndexes() as well (and I am happy to alias it to FindIndices(), if we want 😁).
  • what about the simplified suggestions you had above, e.g. FindImagesByName(ImageIndex, string) or FindImagesByPlatform(ImageIndex, Platform) or those? They are just convenient wrappers around those, but still could be useful. Or is that too much?

E.g.

func FindImagesByPlatform(index ImageIndex, p Platform) ([]v1.Image, error) {
    return FindImages(index, match.Platform(p))
}
func FindImagesByName(index ImageIndex, name string) ([]v1.Image, error) {
    return FindImages(index, match.Name(name))
}

func FindIndexesByPlatform(index ImageIndex, p Platform) ([]v1.ImageIndex, error) {
    return FindIndexes(index, match.Platform(p))
}
func FindIndexesByName(index ImageIndex, name string) ([]v1.ImageIndex, error) {
    return FindIndexes(index, match.Name(name))
}

And do we need the single versions of those?

func FindManifests(index v1.ImageIndex, matcher match.Matcher) ([]v1.Descriptor, error)
func FindImages(index v1.ImageIndex, matcher match.Matcher) ([]v1.Image, error)

func FindManifest(index v1.ImageIndex, matcher match.Matcher) (v1.Descriptor, error) {
    manifests, err := FindManifests(index, matcher)
    if err != nil {
      return nil, err
    }
    if len(manifests) > 0 {
      return manifests[0], nil
    }
    return nil, nil
}
func FindImage(index v1.ImageIndex, matcher match.Matcher) (v1.Image, error) {
   // etc.
}
func FindIndex(index v1.ImageIndex, matcher match.Matcher) (v1.ImageIndex, error) {
}
func FindImageByPlatform(index ImageIndex, p Platform) (v1.Image, error) {
    return FindImage(index, match.Platform(p))
}
func FindImageByName(index ImageIndex, name string) (v1.Image, error) {
    return FindImage(index, match.Name(name))
}

func FindIndexByPlatform(index ImageIndex, p Platform) (v1.ImageIndex, error) {
    return FindIndex(index, match.Platform(p))
}
func FindIndexByName(index ImageIndex, name string) (v1.ImageIndex, error) {
    return FindIndex(index, match.Name(name))
}

@jonjohnsonjr
Copy link
Collaborator

adds the custom NotImageError and NotIndexError errors and uses them

This doesn't feel quite right to me. FindImages knows that calling Image will fail because it can see the descriptor's media type. We don't need these errors to signal this, and I'm worried we will forget to add these errors in certain places unless we centralize some of this logic.

what other functions do we need that parallel those

I'd say FindIndexes and FindLayers both make sense here, but I'd be okay with holding off on that until there's an obvious use case.

FindImagesByName ...

I don't think we'd want these given how well FindImages and FindManifests compose with the match package. Perhaps we can leave some breadcrumbs to document this idea, but it seems silly to duplicate.

And do we need the single versions of those?

I'd just leave it out for now. Given that we don't recurse and just iterate over one level, taking the 0th element of the result seems totally reasonable to me. If someone complains this is slow for their 10,000 entry index.json, we can come back to it.

@deitch
Copy link
Collaborator Author

deitch commented Nov 19, 2020

I'd just leave it out for now. Given that we don't recurse and just iterate over one level, taking the 0th element of the result seems totally reasonable to me. If someone complains this is slow for their 10,000 entry index.json, we can come back to it.

If we find someone using this for a 10,000 entry index.json, we need to send them to containerd. 😁

@deitch
Copy link
Collaborator Author

deitch commented Nov 19, 2020

I'd say FindIndexes and FindLayers both make sense here, but I'd be okay with holding off on that until there's an obvious use case.

OK, I added in FindIndexes with that signature.

What is FindLayers? I can see how it makes sense if I am starting with a v1.Image, but with one of those I can just call image.Layers(). If I am starting with a v1.ImageIndex, why wouldn't I just do index.Image(hash) or FindImages(index, matcher) and then get my v1.Image and call Layers().

Unless you are thinking it is useful for when I have a v1.Image, and I want Layers, but I don't want all of them? I want to filter by mediaType or annotations or some other field on the descriptor for that layer, in the manifest for the image? Is that it?

@deitch
Copy link
Collaborator Author

deitch commented Nov 19, 2020

I don't think we'd want these given how well FindImages and FindManifests compose with the match package. Perhaps we can leave some breadcrumbs to document this idea, but it seems silly to duplicate.

I never argue with less work. With the present ones, we have the capabilities, so those would just be convenience. If we find ourselves or others regularly doing boilerplate, we can worry about adding them later.

@deitch
Copy link
Collaborator Author

deitch commented Nov 19, 2020

FindImages knows that calling Image will fail because it can see the descriptor's media type. We don't need these errors to signal this

Oh, so you are saying, inside FindImages() and FindIndexes(), before even doing index.Image(desc.Digest) or index.ImageIndex(desc.Digest), first check desc.MediaType to see if it is appropriate. If not, don't even bother checking it. I like that, much more elegant.

I'm worried we will forget to add these errors in certain places unless we centralize some of this logic.

So was I. I actually thought about having both of those functions v1.ImageIndex.Image(hash) and v1.ImageIndex.ImageIndex(hash) return a custom error interface that implements error, so that you had to return some custom error.

But I will admit that just checking inside the funcs avoids the issue altogether.

Updating soon.

@deitch deitch force-pushed the find-index branch 2 times, most recently from a453c0a to f06d938 Compare November 19, 2020 09:20
@deitch
Copy link
Collaborator Author

deitch commented Nov 19, 2020

OK, have another look.

  • reverted all of the custom errors, and comments around them
  • added IsImage() and IsIndex() to types.MediaType
  • used IsImage() in FindImages(); and IsIndex() in FindIndexes()

This looks much cleaner, I must say

@deitch
Copy link
Collaborator Author

deitch commented Nov 19, 2020

Rebased on latest

@jonjohnsonjr
Copy link
Collaborator

Unless you are thinking it is useful for when I have a v1.Image, and I want Layers, but I don't want all of them? I want to filter by mediaType or annotations or some other field on the descriptor for that layer, in the manifest for the image? Is that it?

Yes exactly. It's not really that useful, but the type signatures line up and it follows the pattern. If I find myself really wanting FindLayers, I can just implement it myself. The only use case I can really imagine is filtering out non-distributable layers when pushing images.

This looks much cleaner, I must say

+1 👍

I'll probably refactor some stuff to use these in a few places 😄

Copy link
Collaborator

@jonjohnsonjr jonjohnsonjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other than a couple nits and tests 👍

pkg/v1/partial/index.go Outdated Show resolved Hide resolved
pkg/v1/partial/index.go Outdated Show resolved Hide resolved
pkg/v1/types/types.go Outdated Show resolved Hide resolved
pkg/v1/types/types.go Outdated Show resolved Hide resolved
@deitch
Copy link
Collaborator Author

deitch commented Nov 19, 2020

I accepted all of your changes, but it then turned it into 4 new commits in addition to the base one. We probably should squash them down. And if I could figure out how to pull it back locally, I would.

@jonjohnsonjr
Copy link
Collaborator

And if I could figure out how to pull it back locally, I would.

Hah, that's fine. They're trivial changes, so I'd just force push over them probably.

@deitch
Copy link
Collaborator Author

deitch commented Nov 19, 2020

OK got it.

@deitch
Copy link
Collaborator Author

deitch commented Nov 19, 2020

What else do you think we need here @jonjohnsonjr ?

@jonjohnsonjr
Copy link
Collaborator

jonjohnsonjr commented Nov 19, 2020 via email

@deitch
Copy link
Collaborator Author

deitch commented Nov 19, 2020

I think just some test coverage and it'll be ready for merge.

Now you're really pushing your luck! 😂

@jonjohnsonjr
Copy link
Collaborator

I am the Lorax! I speak for the tests. I speak for the tests, for the tests have no tongues.

image

This is slowly dropping as I get lazier in code review so I'm trying to be less lazy about it 😜

@deitch
Copy link
Collaborator Author

deitch commented Nov 19, 2020

yeah, but it is such a pain, since you need to construct actual v1.Image and v1.ImageIndex to work with FindImages and FindIndexes. We have the empty package and mutate, but you need to provide a lot of info for each.

@jonjohnsonjr
Copy link
Collaborator

random.Index and random.Image are you friends here.

@deitch
Copy link
Collaborator Author

deitch commented Nov 19, 2020

This line here:

		manifest.Manifests = append(manifest.Manifests, v1.Descriptor{
			Digest:    digest,
			Size:      size,
			MediaType: mediaType,
		})

I need to change it for each manifest, so I have what to match against. It needs platform or annotations, etc.

I cannot quite pass it in, since that would mess up the signature (and it needs to be different for each one). I think we need a way to modify them?

@jonjohnsonjr
Copy link
Collaborator

jonjohnsonjr commented Nov 19, 2020

I need to change it for each manifest, so I have what to match against. It needs platform or annotations, etc.

Not really, you just need any match.Matcher. You could use func(_ v1.Descriptor) bool { return true }.

If you want to make sure it "really works", I'd do something super lazy like:

m, _ := idx.IndexManifest()
digest := m.Manifests[0].Digest

matcher := func(desc v1.Descriptor) bool {
  return desc.Digest != digest
} 

And confirm that len(FindImages(idx, matcher)) == len(m.Manifests) - 1.

@deitch
Copy link
Collaborator Author

deitch commented Nov 20, 2020

I had to do some strange construction to get TestFindIndexes() to work, but it did.

We have random.Image() that creates an image, and random.Index() that creates a v1.ImageIndex with Image children, but not with Index children (or a mix).

Signed-off-by: Avi Deitcher <avi@deitcher.net>
Co-authored-by: jonjohnsonjr <jonjohnson@google.com>
Signed-off-by: Avi Deitcher <avi@deitcher.net>
@deitch
Copy link
Collaborator Author

deitch commented Nov 20, 2020

Thinking about #835 and this, @jonjohnsonjr . Once that is in, we could have the following:

func FindResolvable(index ImageIndex, matcher match.Matcher) ([]v1.ResolvableDescriptor, error)

And then you end up with the four:

  • FindManifests() - lower-level, just get the v1.Manifest
  • FindImages() - if you know what you are looking for is a v1.Image
  • FindIndexes() - if you know what you are looking for is a v1.ImageIndex
  • FindResolvable() - if you know you want something, but do not yet know what it is

The last one closes it out nicely, and is the equivalent of remote.Get() for any v1.ImageIndex. I know I want by some criterion (e.g. name, or annotation, or platform), but I don't yet know what it will bring back to me.

@deitch
Copy link
Collaborator Author

deitch commented Nov 20, 2020

Oh good, CI is green.

@jonjohnsonjr
Copy link
Collaborator

FindResolvable

Not sure about the name but I agree with idea.

Copy link
Collaborator

@jonjohnsonjr jonjohnsonjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jonjohnsonjr jonjohnsonjr merged commit 93b58dc into google:master Nov 20, 2020
@deitch deitch deleted the find-index branch November 21, 2020 16:52
@deitch
Copy link
Collaborator Author

deitch commented Nov 21, 2020

Not sure about the name but I agree with idea.

Neither am I. Let's move this discussion over to #835 , which is where I took the name from.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants