Registrations in IANA URN Namespaces registry and Media Types registry #36

martin-kuba opened this issue Feb 22, 2022 · 18 comments

@martin-kuba

The DURI Passport group is working on a new version of the GA4GH Passport specification that will define a new type of token for Passport following RFC 8693 OAuth 2.0 Token Exchange.

Defining a new type of token requires defining two things:

  • A URI identifying the token type. While any URL is a URI, so a URL like https://ga4gh.org/uri/passport-token could be used, it would look better to use a URN with the prefix urn:ga4gh, which needs to be registered in the IANA URN Namespaces registry.

  • A media type identifier. An identifier like application/ga4gh_passport+jwt, indicating a specific format of JWT, could help in distinguishing different types of tokens. (The idea is taken from RFC 9068, JSON Web Token Profile for OAuth 2.0 Access Tokens, which defines the media type application/at+jwt for JWTs representing access tokens.) A new media type needs to be registered in the IANA Media Types registry.
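A minimal sketch of the URL-versus-URN distinction, using only the Python standard library. The function name `describe_token_type_uri` and the identifier `urn:ga4gh:passport-token` are hypothetical and chosen purely for illustration; no such URN is defined in this thread.

```python
from urllib.parse import urlsplit


def describe_token_type_uri(uri: str) -> dict:
    """Classify a token-type identifier as a URL or a URN (illustrative only)."""
    parts = urlsplit(uri)
    if parts.scheme.lower() == "urn":
        # A URN has the form urn:<NID>:<NSS>. The NID (namespace identifier)
        # is the part that would need an IANA registration, e.g. "ga4gh".
        nid, _, nss = parts.path.partition(":")
        return {"kind": "urn", "namespace": nid.lower(), "specific": nss}
    return {"kind": "url", "host": parts.netloc, "path": parts.path}


# The URL comes from the comment above; the URN is a hypothetical example.
print(describe_token_type_uri("https://ga4gh.org/uri/passport-token"))
print(describe_token_type_uri("urn:ga4gh:passport-token"))
```

Either form works as an opaque identifier; the URN form only makes the namespace explicit and independent of any web host.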

These registrations should be circulated throughout GA4GH to make sure they are done in a way that all work streams can make use of.

@martin-kuba
Author

Maybe the media type identifier could be hierarchical, like application/passport+ga4gh+jwt for passport tokens and application/workorder+ga4gh+jwt for work order tokens and so on. The DURI Passport group wants to define at least two types of tokens.

@jb-adams
Member

jb-adams commented Mar 7, 2022

This looks good. We could also consider adding a component for the name of the specification, e.g. application/workorder+passport+ga4gh+jwt, so that each working group within GA4GH has its own mini "namespace." But this is probably not necessary and might end up looking messy.

@jmarshall
Member

jmarshall commented Mar 7, 2022

The Wikipedia page on media types gives a good overview of their syntax. The complete rules and details are in RFC 6838. Of those media types currently registered (approx. 2000), very few use “_” and none contain more than one “+” character.

Standards tree media types generally represent relatively general-purpose formats and concepts. Thus if these passports are of general interest to the internet community, then a media type like application/passport+jwt could be appropriate. (There are currently four particular-JWT-flavour types like application/XYZ+jwt registered alongside the vanilla application/jwt.)

If OTOH these are of interest only to people implementing GA4GH protocols, then a vendor tree media type like application/vnd.ga4gh.passport+jwt would probably be more appropriate.
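As a rough sketch of the naming patterns described here, the following splits a media type into its top-level type, registration tree, and structured syntax suffix. The regular expression is a simplification of the RFC 6838 grammar, and the helper name `split_media_type` is an assumption made for this example.

```python
import re

# Simplified form of the RFC 6838 restricted-name grammar (an approximation).
MEDIA_TYPE_RE = re.compile(
    r"^(?P<type>[A-Za-z0-9][A-Za-z0-9!#$&^_.-]*)/"
    r"(?P<subtype>[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*)$"
)

# Faceted prefixes named in RFC 6838; anything else is treated as standards tree.
TREES = {"vnd": "vendor", "prs": "personal", "x": "unregistered"}


def split_media_type(media_type: str) -> dict:
    """Return the type, tree, subtype and suffix of a media type string."""
    match = MEDIA_TYPE_RE.match(media_type.strip())
    if match is None:
        raise ValueError(f"not a valid media type: {media_type!r}")
    subtype = match.group("subtype")
    # The structured syntax suffix is whatever follows the last "+".
    base, plus, suffix = subtype.rpartition("+")
    if not plus:
        base, suffix = subtype, None
    facet = base.split(".", 1)[0].lower() if "." in base else ""
    return {
        "type": match.group("type").lower(),
        "tree": TREES.get(facet, "standards"),
        "subtype": subtype,
        "suffix": suffix,
    }


for name in ("application/jwt", "application/at+jwt",
             "application/vnd.ga4gh.passport+jwt"):
    print(split_media_type(name))
```

A client that only understands generic JWTs could dispatch on the `suffix`, while a GA4GH-aware client could dispatch on the full subtype.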

So that's the sort of naming patterns these things should follow. The other question is how many media types really need to be distinguished for these purposes. IMHO what this comes down to is what a client would do differently when informed of these proposed more particular media types:

  • Does a client do different things given a passport or a work order, or are the formats substantially the same with a few different fields or field values?
  • Is a passport perhaps a particular GA4GH-orientated form of access token? Could it use the existing application/at+jwt for example?

@martin-kuba
Author

The application/vnd.ga4gh.passport+jwt media type looks appropriate to me, thanks for finding it.

GA4GH passport tokens and work-order tokens are not OAuth access tokens, so at+jwt is not appropriate for them. A Passport is a list of all visas, so it lists everything that is known about a user and everything that the user can access. A work-order token will be limited to the minimal set of access rights needed for a workflow, minimizing the risk of sensitive data being stolen.

@ianfore

ianfore commented Mar 8, 2022

As noted in yesterday's call, it is good to see consideration of type. Adding a note that the question of type in DRS that I mentioned was already raised to TASC as part of #22. Good to see the discussion homing in on appropriate authoritative registries of types.

@ianfore

ianfore commented Mar 8, 2022

Does it make sense to put the organization name in the type though?

Working this through with a parallel example

  • application/bam
  • application/cram

seems preferable to

  • application/ga4gh+bam
  • application/ga4gh+cram

On one hand the latter seems fine: GA4GH is the custodian of bam and cram as standards. However, the former has the benefit of simplicity, and also of separating the names from organizational structures. GA4GH brings these standards to the community, but does it want to make that visible in this way?

Understood that for the original topic here, the intent is to say what kind of Passport is being referred to.

Note: it's application/pdf not application/adobe+pdf or application/vnd.adobe.pdf

Separate, but related, point

> If OTOH these are of interest only to people implementing GA4GH protocols, then a vendor tree media type like application/vnd.ga4gh.passport+jwt would probably be more appropriate.

Are vendor tree media types like this commonly accepted practice? Any pointers to other examples of this? It would help to see how this has been used and what situations it best applies to.

@jmarshall
Member

@ianfore: I suggest you read RFC 6838 and look over the existing entries in IANA's Media Types registry. (Links to both can be found in previous comments on this issue.)

@ianfore

ianfore commented Mar 8, 2022

@jmarshall Thanks for the pointer. From what I see in section 3.2, Vendor Tree, I would lean towards not using it for things like bam and cram. It seems to me these have a legitimate claim to being industry standards, as opposed to being related to a vendor-specific "product or set of products". That is potentially true for passport too, but it doesn't yet have the wide use that bam, cram, vcf etc. do.

My sense is that vendor trees are particularly applicable to entities that "do not qualify as recognized standards-related organizations". Do we think GA4GH is "recognized"?

The IANA registry shows some established patterns in our (health) domain that we might want to consider following:

application/fhir+xml
application/fhir+json
application/dicom
application/dicom+xml
application/dicom+json

The standards organization names (HL7, NEMA) do not feature in those.

@jmarshall
Member

LSG has bam/cram/vcf/etc in hand. They are indeed used by all sorts of programs and communities outwith GA4GH, so we believe they are appropriate for the Media Types standards tree.

@martin-kuba
Author

I searched for examples of vendor tree usage in media types in my /etc/mailcap file:

.xls   application/vnd.ms-excel
.ppt   application/vnd.ms-powerpoint
.xlsx  application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
.pptx  application/vnd.openxmlformats-officedocument.presentationml.presentation
.docx  application/vnd.openxmlformats-officedocument.wordprocessingml.document
.ods   application/vnd.oasis.opendocument.spreadsheet
.odp   application/vnd.oasis.opendocument.presentation
.odt   application/vnd.oasis.opendocument.text
.deb   application/vnd.debian.binary-package

I would say that GA4GH is like OASIS which is “a nonprofit consortium that works on the development, convergence, and adoption of open standards”.

@susanfairley
Contributor

Thanks to those who have moved this conversation forward.

In relation to registering with IANA, following discussion with Peter Goodhand, I think GA4GH Inc is likely the appropriate entity to register these. The IANA registration form (https://www.iana.org/form/media-types) notes that the Change Controller is usually the standards body, which would seem to match. I'd suggest that the contact be the secretariat email address for continuity, but I'm happy to hear comments.

@jb-adams
Member

jb-adams commented Apr 1, 2022

I agree with @susanfairley that these should all be registered and maintained by GA4GH Inc. We can keep track of GA4GH entities that have been registered with IANA in this repo, and link out to the main registries for Namespaces and Media Types.

Regarding the exact media type strings to register for the two Passport-related entities, I +1 @jmarshall's application/vnd.ga4gh.passport+jwt for the Passport JWT. By extension, I think we could also register application/vnd.ga4gh.workorder+jwt for the work order token.

@ianfore

ianfore commented Apr 4, 2022

Agree the vendor tree approach (application/vnd.ga4gh.*) seems appropriate for the media types needed for passport.

@martin-kuba
Author

@jmarshall @ianfore @jb-adams @susanfairley The Passports/AAI Technical Working subgroup (a collaboration between the Data Security and DURI Work Streams) is finishing the AAI and Passport specifications version 1.2.

The specifications are going to use the media type application/vnd.ga4gh.passport+jwt for the typ JWT header parameter and the URI urn:ga4gh:params:oauth:token-type:passport for the requested_token_type parameter during OAuth 2.0 Token Exchange.
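A hedged sketch of how these two identifiers might appear in practice. The `grant_type` value and parameter names come from RFC 8693; the choice of an access token as the subject token is an assumption for this example, and the function names are hypothetical. The `typ` check only reads the JWT header and does not verify the signature.

```python
import base64
import json
from typing import Optional
from urllib.parse import urlencode

PASSPORT_TOKEN_TYPE = "urn:ga4gh:params:oauth:token-type:passport"
PASSPORT_MEDIA_TYPE = "application/vnd.ga4gh.passport+jwt"


def token_exchange_body(access_token: str) -> str:
    """Form-encoded body of an RFC 8693 request asking for a passport token."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": access_token,
        # Assumption: the token being exchanged is an OAuth access token.
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "requested_token_type": PASSPORT_TOKEN_TYPE,
    })


def jwt_typ(token: str) -> Optional[str]:
    """Read the typ header parameter of a compact JWT without verifying it."""
    header_b64 = token.split(".")[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded)).get("typ")


def is_passport_jwt(token: str) -> bool:
    typ = jwt_typ(token)
    if typ is None:
        return False
    # RFC 7515 lets producers omit the "application/" prefix in typ.
    if "/" not in typ:
        typ = "application/" + typ
    return typ.lower() == PASSPORT_MEDIA_TYPE


print(token_exchange_body("example-access-token"))
```

A real client would still validate the signature and claims of the returned token; the `typ` value only says which kind of JWT the issuer claims it is.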

May I ask how the registration of the media type and the urn:ga4gh URN prefix is going?

@jmarshall
Member

As @susanfairley previously noted, Media Types are registered by filling in the form at https://www.iana.org/form/media-types. For the vendor-tree media types that @martin-kuba needs, this is reasonably straightforward.

So the form will need to be submitted twice (probably with secretariat as contact and change controller, hence probably by secretariat), with Type Name = application, Subtype Name = Vendor Tree and ga4gh.passport+jwt or ga4gh.workorder+jwt respectively.

It would be good if the DURI folks could say something about whether there are any parameters (the parts appearing after a semicolon, like in e.g. text/plain;charset=iso-8859-1) used with passports or workorders, and in particular if they could draft something for the Security Considerations section.


For the desired standards tree registrations like application/bam and text/sam the process is a little more involved, and I'm still working through writing up a draft for that.

@martin-kuba
Author

> It would be good if the DURI folks could say something about whether there are any parameters (the parts appearing after a semicolon, like in e.g. text/plain;charset=iso-8859-1) used with passports or workorders, and in particular if they could draft something for the Security Considerations section.

No parameters are needed.
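To make "parameters" concrete, here is a small sketch that separates a media type from its parameters using Python's standard `email.message.Message`. The helper name `split_parameters` is an assumption for this example.

```python
from email.message import Message


def split_parameters(content_type: str):
    """Return (media_type, parameters) for a Content-Type style value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_params() returns [(media_type, ''), (name, value), ...]
    params = msg.get_params()
    return msg.get_content_type(), dict(params[1:])


print(split_parameters("text/plain;charset=iso-8859-1"))
print(split_parameters("application/vnd.ga4gh.passport+jwt"))
```

With no parameters defined, the passport media type is always used as a bare string, so a plain case-insensitive comparison is enough to recognize it.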

@andrewyatz
Contributor

This issue has been flagged as a TASC issue to begin resolving and will be added to the agenda.

@jmarshall
Member

See also suggestions around perhaps registering BGZF as distinct from gzip so that web browsers might be convinced not to decompress it in transit, discussed in samtools/hts-specs#734. (However my suspicion is that this may not be feasible or effective.)
