Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add + as a valid tag character #154

Open
JeremyLoy opened this issue May 27, 2020 · 15 comments
Open

Add + as a valid tag character #154

JeremyLoy opened this issue May 27, 2020 · 15 comments
Milestone

Comments

@JeremyLoy
Copy link

JeremyLoy commented May 27, 2020

The OCI specification lacks a strict definition of valid tags.

I believe that image tags should support the full semver specification. To be clear, I am not saying all tags must be valid semver, I am merely stating that a valid semver should also be a valid tag.

There are some technical limitations on the Docker/Moby side, as the + character is not allowed. Hopefully ratifying the addition (upstream? downstream? not exactly sure of the politics here) will spur this along. The issue has been open for 5 years with no progress.

https://semver.org
distribution/distribution#1201
distribution/distribution#1202 (comment)

@hallyn
Copy link
Contributor

hallyn commented May 28, 2020

Sorry, could you point to where in the spec.md right now the supported tags are enumerated? I took a quic klook but did not find it.

@JeremyLoy
Copy link
Author

I don’t believe it’s defined at all. I’m hoping that this issue will track the inclusion of the tag specification, and that it should allow for semver strings.

@amouat
Copy link
Contributor

amouat commented May 28, 2020

This is a good point, but maybe the title is a bit misleading - what you're really asking is that tags support the "+" character, right?

@JeremyLoy
Copy link
Author

JeremyLoy commented May 28, 2020

Yes and no. Lack of support for the + character is a limitation of Docker/Moby. With that sole modification, it’s possible for a Docker tag to be a valid semver string.

This issue is really two fold

  1. define what a valid tag is for OCI compliance
  2. suggesting that the tag specification allows for semver strings

Right now since there is no specification, nothing is preventing an implementation that lacks support for . or v etc.

@JeremyLoy
Copy link
Author

I guess a better title would be “define a specification for tags” I’ll make it so

@JeremyLoy JeremyLoy changed the title Support Semantic Versioning (semver) spec for tags Define the Specification for Tags May 28, 2020
@amouat
Copy link
Contributor

amouat commented May 28, 2020

To gather more context, the image spec does talk a little about tags. In particular note that https://github.com/opencontainers/image-spec/blob/e562b04403929d582d449ae5386ff79dd7961a11/image-layout.md#indexjson-file mentions

A common use case of descriptors with a "org.opencontainers.image.ref.name" annotation is representing a "tag" for a container image

The description for this annotation is intriguing:

org.opencontainers.image.ref.name Name of the reference for a target (string).

SHOULD only be considered valid when on descriptors on index.json within image layout.
Character set of the value SHOULD conform to alphanum of A-Za-z0-9 and separator set of -._:@/+
The reference must match the following grammar:

ref       ::= component ("/" component)*
component ::= alphanum (separator alphanum)*
alphanum  ::= [A-Za-z0-9]+
separator ::= [-._:@+] | "--"

https://github.com/opencontainers/image-spec/blob/master/annotations.md

I don't really follow the intent of this at all. It seems to be representing the full image name, including repo, rather than just the "tag" as it allows "components" and the ':' character. This more or less contradicts the first link.

And it gets even stranger if you consider that "middle" components can validly contain things like ":" e.g. myrepo:something/mytag:1.0. (I wonder if that's to allow myrepo:8443/bla/bla/name:tag)

@dmcgowan
Copy link
Member

dmcgowan commented Jun 2, 2020

@JeremyLoy the original issue was stalled not on politics, but rather a full assessment done on the impact and a recommendation for how to change the spec. It is related to + as that has special meaning during the interpretation of the URL. Clients and servers must account for this otherwise tags will get rejected or confused by containing spaces or '%' characters. It may be possible to move forward with this by defining how it should be interpreted in the specification and tested for in the conformance testing. If we moved forward with defining a strict definition of a tag for 1.0 without this effort, then + would not be a part of it and make semver forever backwards incompatible with 1.0. So I would change the definition to mention this is about supporting +, I assume a strict definition is not what you want unless someone is going to volunteer to champion the + support 😄

@jd3nn1s
Copy link

jd3nn1s commented May 6, 2023

It'd be good to see if we could get an effort to support + in tags going again. I see in the specification that tags in references are now formally defined via a regular expression in the spec.

The regex today is: [a-zA-Z0-9_][a-zA-Z0-9._-]{0,127}. To be able to support all valid semantic versions this would be modified to add +: [a-zA-Z0-9_][a-zA-Z0-9+._-]{0,127}

<reference> is only used when defining paths in URLs apart from when it itself is defined. Therefore we can amend the definition to include guidance on encoding.

+ is a URI sub-component delimiter (sub-delims) as defined in RFC 3986 section 2.2. Additionally, section 3.3 states that path segments can include sub-delims without percent encoding so it is accurate to say no encoding or special handling of + is required. Bugs do exist out in the wild (like S3 treats + in path as a space) but I think these should be handled by the various implementations and not influence the API specification.

@jzelinskie
Copy link
Member

As a maintainer, I give this a huge +1.

The vast majority of folks are using tags for versions of their software and semver is quite common. Not being able to satisfy that is a huge failure of the standard, IMO.

What would it take to kick the tires here? I think it needs some eyeballs from the folks working on the runtime.

@tianon
Copy link
Member

tianon commented May 13, 2023

We probably need eyes from around the whole ecosystem, right? I imagine the original distribution regex for this has been cargo culted into a number of places. 😬

(Maybe from the spec perspective this ends up with the current regex defined as MUST and + as just SHOULD?)

@jd3nn1s
Copy link

jd3nn1s commented May 16, 2023

As supporting + would be additive rather than breaking I think it would be better to require it (aka MUST) in a new version of the spec. Otherwise it will always be hit or miss whether it can be used, and in an ecosystem of tools it only takes one problem to de-rail a workflow.

@jd3nn1s
Copy link

jd3nn1s commented May 16, 2023

I missed the Listing tags API. It can take a tag name as a query param, which would need percent-encoding to avoid being interpreted as a space.

I do wonder about backward compatibility on this API. If a registry starts returning tag values that are invalid for existing clients then bad things might happen. For instance, imagine a script that gets all tags in a registry and then iterates through them running Docker commands. The tag with a + in it would cause Docker to fail and the script to potentially abort.

Another option is to rev the API version so that v2 list omits any tags with + in them while v3 list returns all tags. Having some tags missing from the v2 list might be a bit spooky for some consumers.

@sudo-bmitch
Copy link
Contributor

sudo-bmitch commented May 16, 2023

I'm hesitant on this because it doesn't feel like there's enough new functionality to warrant a breaking change. In particular, this would still be pushing the same image content, just a slightly different name.

Otherwise it will always be hit or miss whether it can be used, and in an ecosystem of tools it only takes one problem to de-rail a workflow.

There's no way to know that all the tools receiving a specific tag are on the latest OCI spec. So something like a runtime or vulnerability scanner may not be able to pull images with the + in the tag. The most an image producer would be able to know is whether they could push the image to the first registry, but not any mirrors or downstream consumers.

Users can insist on having OCI vn.n.n support in their tooling. However, there's no conformance test for client tooling, only registry servers. And downstream consumers may be different organizations from the producers, e.g. an open source developer that has no visibility into who is pulling their images and how they are running them.

@jzelinskie
Copy link
Member

jzelinskie commented May 22, 2023

So something like a runtime or vulnerability scanner may not be able to pull images with the + in the tag.

I would hope that security critical tools like this wouldn't fail open, but I think it's a valid point that there's not a good way to move forward starting at the spec.

This should probably be tagged in a v2 milestone because it's a breaking change and something we don't want to forget once we commit to start breaking.

@sudo-bmitch sudo-bmitch modified the milestones: v1.0.0-next, v2.0.0 May 22, 2023
@sudo-bmitch
Copy link
Contributor

This should probably be tagged in a v2 milestone because it's a breaking change and something we don't want to forget once we commit to start breaking.

Done.

@sudo-bmitch sudo-bmitch changed the title Define the Specification for Tags Add + as a valid tag character Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants