Allow relative IRIs for @vocab #72

Closed
gkellogg opened this issue Sep 28, 2018 · 5 comments

Comments

@gkellogg
Member

JSON-LD separates handling of IRI conversions for document- and vocabulary-relative IRIs. At times, it is useful to have more vocabulary items be considered in the document space. A change was made to allow the empty string ("") as a value for @vocab, but this may not be adequate.

Issue #56 somewhat complains about this, as does #37.

Currently, the Syntax document describes this in §4.1.2.1 Using the Document Base as the Default Vocabulary, but the use of a term "#breweats" is not a natural JSON property. If @vocab could be set to "#", this would allow a more natural expression.

The API document has a carve-out for the empty string, but otherwise requires @vocab to resolve to an absolute IRI:

Otherwise, if value is an absolute IRI or blank node identifier, the vocabulary mapping of result is set to value. If it is not an absolute IRI, or a blank node identifier, an invalid vocab mapping error has been detected and processing is aborted.

The proposal would be to treat things that are not absolute IRIs as IRIs relative to the document base as does the current empty-string carveout:

Otherwise, if value is the empty string (""), the effective value is the current base IRI.

Resolving vocabulary-relative IRIs is done with simple string concatenation, so the proposal is to handle any non-absolute value of @vocab by concatenating it to the base IRI; if the result is not a valid absolute IRI, we would continue to generate an invalid vocab mapping error. Note that this will not invalidate any 1.0 documents, but will allow things which were previously not valid.
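
As a rough illustration of the proposal above (not the API document's actual algorithm text), the handling of an @vocab value could be sketched as follows. The names `vocab_mapping`, `is_absolute_iri`, and `InvalidVocabMapping` are hypothetical, and the absolute-IRI test is simplified to "starts with an RFC 3986 scheme":

```python
import re

# RFC 3986 scheme syntax: a letter, then letters, digits, "+", "-" or ".", then ":".
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


class InvalidVocabMapping(ValueError):
    """Hypothetical stand-in for the 'invalid vocab mapping' error."""


def is_absolute_iri(value: str) -> bool:
    # Simplified assumption: an IRI is absolute if it begins with a scheme.
    return bool(_SCHEME.match(value))


def vocab_mapping(value: str, base: str) -> str:
    """Return the effective vocabulary mapping for an @vocab value (sketch)."""
    if value.startswith("_:") or is_absolute_iri(value):
        return value          # blank node identifier or absolute IRI: used as-is
    candidate = base + value  # plain string concatenation; "" yields the base itself
    if not is_absolute_iri(candidate):
        raise InvalidVocabMapping(candidate)
    return candidate
```

With a hypothetical base of `http://example.com/doc`, an @vocab of `"#"` would yield `http://example.com/doc#`, while the existing empty-string case still yields the base unchanged.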

@gkellogg gkellogg self-assigned this Oct 26, 2018
@iherman
Member

iherman commented Oct 27, 2018

This issue was discussed in a meeting.

  • RESOLVED: Allow IRIs to be constructed by string concatenation with multiple @vocabs, with appropriate security consideration section
  • ACTION: Rob Sanderson to make a security consideration issue re relative IRI concatenation
  • ACTION: Rob Sanderson to create security consideration re javascript URIs and relative IRIs
Gregg Kellogg: in JSON-LD, vocab-relative and document-relative IRIs are resolved differently. we’ve already looked at this problem
… and offered the ability to set @vocab=""
… which allows vocab to be resolved against the document base
… motivated at least in part because in other RDF formats, that distinction doesn’t exist
… so there was a parity issue against other serializations
… this issue goes further, and lets @vocab get set to any relative URI, which would then be evaluated against the document base
… the proposal includes that if a @vocab is already set and a new relative @vocab comes along, one simply string-appends the new one to the old one
Rob Sanderson: in the case of a base that came from HTTP with a # on the end, that would get lost
Gregg Kellogg: this also addresses the problem that Manu raised in the context of blank-node-properties.
Rob Sanderson: if you set vocab to ../# and you had example.org/ns then you get example.org/ns../#
Ivan Herman: as an editorial matter we must make very clear that this is string concatenation, not IRI concatenation
Rob Sanderson: are there good rules for determining relative vs. absolute IRIs?
Ivan Herman: Look for the scheme
Ivan Herman: I am almost sure that the URI spec defines that very clearly
Rob Sanderson: but this could be a security problem if a malicious actor sets a CURIE prefix of “http” to some malicious address
Ivan Herman: also the same thing with base
Gregg Kellogg: We can’t really know
Rob Sanderson: we can just advise people of the security concerns
Action #1: Rob Sanderson to make a security consideration issue re relative IRI concatenation
Ivan Herman: do we check for “acceptable” scheme?
… what about Javascript URIs (bookmarklets)
Rob Sanderson: “@vocab”: “javascript:”
Action #2: Rob Sanderson to create security consideration re javascript URIs and relative IRIs
Gregg Kellogg: we don’t now check for defined schemes
Benjamin Young: it’s the responsibility of the document loader to worry about this
… it could just choose not to resolve troubling URIs
Ivan Herman: so you put together a URI from the JSON-LD (could happen in many ways).
… at that level, do we add a security check?
Benjamin Young: that’s the job of the person using the URI
Rob Sanderson: Or rather "@vocab": "javascript:document.alert('hi!');"
Benjamin Young: this isn’t a job for the syntax
… e.g. data: can hide anything
… data: URIs
Gregg Kellogg: this could be used for maliciousness, but it’s on the users of the URIs to be careful
Rob Sanderson: we don’t do path expansion, we’re doing string concat here. so we won’t catch a lot of stuff
Benjamin Young: but the advantage of string concat is that it supports non-pathy URIs
Gregg Kellogg: @vocab is used only for properties
several: and generally people don’t dereference properties, and nothing in our algorithms says that they should
Gregg Kellogg: we can modify the API to return only URIs of some form.
Benjamin Young: we can say “we never use these URIs, so there’s no concern w/i JSON-LD, but if users choose to use them, the usual concerns about URIs from the wild apply”
Gregg Kellogg: we might consider softening the current restrictions in 3.6.3
… to use IRI expansion and not string concatenation
Rob Sanderson: "@vocab": "http://example.org/ns/" and then "@vocab": "/"
Rob Sanderson: currently you get "http://example.org/ns//" which is unexpected for relative IRIs
Rob Sanderson: And the expectation would be "http://example.org/"
Ivan Herman: are we making the distinction between the two kinds of resolution disappear?
Gregg Kellogg: still the issue of concatenation vs. IRI resolution
Gregg Kellogg: how do I establish @vocab, vs. how do I use it?
Ivan Herman: if we are in @vocab we do string concat, it’s clean
… and users just have to know about that
… let’s not mingle concatenation and IRI resolution
Gregg Kellogg: we’re only interested in resolving @vocab when it is relative, that’s all
Ivan Herman: You’re right, but that’s about implementing the system
… I’m talking about end users
… if we just have string concat for @vocab, that’s clean, I understand that
… even if we do IRI resolution instead over here
… somewhere else. but any problems with doubled slashes, etc., are users’ problems to deal with
Rob Sanderson: if we went all the way to have @vocab itself computed and vocab terms resolved via IRI resolution, it breaks things
… so let’s stick cleanly to string concatenation
Gregg Kellogg: Ok, but these are different use cases.
Ivan Herman: there is an actual regexp to recognize absolute IRIs, so we can rely on that
Harold Solbrig: doesn’t the CURIE spec speak to this?
Gregg Kellogg: we don’t use CURIEs, either
Harold Solbrig: why are we using something else?
Gregg Kellogg: every RDF serialization uses its own way to discuss short URIs
… (gkellogg then names more than you would think he could off the top of his head)
Rob Sanderson: we can’t necessarily construct all legit IRIs, but most of what we can’t is unusual enough not to be problematic
… if we stick with string concat, we avoid this
… if you construct stupid @vocabs, that’s your problem
Proposed resolution: Allow IRIs to be constructed by string concatenation with multiple @vocabs (Rob Sanderson)
Proposed resolution: Allow IRIs to be constructed by string concatenation with multiple @vocabs, with appropriate security consideration section (Rob Sanderson)
Adam Soroka: +1
Gregg Kellogg: +0.9
Rob Sanderson: +1
Simon Steyskal: +1
Adam Soroka: +1
Harold Solbrig: +1
Benjamin Young: +1
Resolution #3: Allow IRIs to be constructed by string concatenation with multiple @vocabs, with appropriate security consideration section
Ivan Herman: +1
Benjamin Young: the Chaucer quote fwiw https://english.stackexchange.com/questions/139073/meaning-of-if-gold-rust-what-shall-the-iron-do
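
The difference between string concatenation and IRI resolution raised in the transcript can be reproduced with a small sketch. This uses Python's standard `urllib.parse.urljoin` as a stand-in for reference resolution; `concat_vocab` is a hypothetical helper, not part of any JSON-LD API, and the `urn:` value is an illustrative assumption:

```python
from urllib.parse import urljoin


def concat_vocab(current_vocab: str, relative_vocab: str) -> str:
    # String concatenation as resolved in the meeting: no path normalization.
    return current_vocab + relative_vocab


# Rob Sanderson's example: "@vocab": "http://example.org/ns/" followed by "@vocab": "/"
by_concat = concat_vocab("http://example.org/ns/", "/")
by_resolution = urljoin("http://example.org/ns/", "/")

# The "../#" example: dot segments are kept verbatim by concatenation.
dotted = concat_vocab("http://example.org/ns", "../#")

# Concatenation also works for IRIs that are not path-based.
non_pathy = concat_vocab("urn:example:", "terms:")

print(by_concat)      # http://example.org/ns//
print(by_resolution)  # http://example.org/
print(dotted)         # http://example.org/ns../#
print(non_pathy)      # urn:example:terms:
```

The doubled slash and the unnormalized `../` are the surprises the group agreed users would have to be aware of in exchange for a single, predictable rule.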

@BigBlueHat
Member

@azaroth42 you're noted in a couple of actions above:

ACTION: Rob Sanderson to make a security consideration issue re relative IRI concatenation
ACTION: Rob Sanderson to create security consideration re javascript URIs and relative IRIs

Did those get made? They're not linked here if so. 😕

@iherman are the actions we take via Zakim recorded somewhere? I think most of them are disappearing or remain lurking in these call logs on GitHub.

Maybe if we @ mention the GitHub name for the person taking the action that would help close the loop and make it easier to curate all this stuff? No idea...but I would like it to be simpler. 😜

@iherman
Member

iherman commented Jan 11, 2019

@iherman are the actions we take via Zakim recorded somewhere? I think most of them are disappearing or remain lurking in these call logs on GitHub

I am not sure "lurking" is a good term, but indeed they only appear in the issue comments like #72 (comment).

@gkellogg
Member Author

Perhaps we should revisit the idea of a minute index, highlighting the topics, resolutions and actions of each meeting on one page.

@iherman
Member

iherman commented Jan 12, 2019

This issue was discussed in a meeting.

  • RESOLVED: address security concerns related to relative URIs for @vocab in current PRs before closing #72
Allow relative IRIs for @vocab
Benjamin Young: #72
syntax - #114
api - w3c/json-ld-api#58
Gregg Kellogg: when we decided to satisfy the need for using the base as the @vocab,
… I came up with the idea of using an empty string as @vocab.
… When I implemented it, what I ended up doing was resolve @vocab as a relative URI against the base.
… So why not allow that in the spec…
Ivan Herman: ship it! :-)
Gregg Kellogg: It is uncontroversial, as it does not change past behaviour.
… Now you can simply use @vocab: “#”, to use relative URIs everywhere.
Rob Sanderson: –> #72 (comment)
Rob Sanderson: is this the issue we discussed during TPAC?
… if you set vocab to ../# and you had example.org/ns then you get example.org/ns../#
Gregg Kellogg: I have to improve the text to indicate when standard URI resolution is used, and when string concatenation is used
Proposed resolution: address security concerns related to relative URIs for @vocab in current PRs before closing #72 (Benjamin Young)
Ivan Herman: +1
Benjamin Young: +1
Ivan Herman: please close issue 72 when you clarify that and merge it.
Gregg Kellogg: +1
Rob Sanderson: +1
Ivan Herman: Let’s reduce the number of open issues
Pierre-Antoine Champin: +1
David I. Lehn: +1
Resolution #4: address security concerns related to relative URIs for @vocab in current PRs before closing #72
Simon Steyskal: +1
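
To make the motivation concrete, here is a minimal sketch comparing the earlier empty-string approach (a term written as `#breweats`) with the `"@vocab": "#"` form mentioned in the discussion. The base IRI and the `expand_term` helper are assumptions for illustration only, and the base is assumed to carry no fragment of its own:

```python
def expand_term(term: str, vocab: str) -> str:
    # Vocabulary-relative expansion is plain string concatenation.
    return vocab + term


BASE = "http://example.com/document"  # hypothetical document base

# "@vocab": ""  -> the vocabulary mapping is the base, so the term needs the "#".
old_style = expand_term("#breweats", BASE + "")

# "@vocab": "#" -> the "#" moves into the vocabulary mapping; the term is a plain name.
new_style = expand_term("breweats", BASE + "#")

print(old_style)  # http://example.com/document#breweats
print(new_style)  # http://example.com/document#breweats
```

Both forms produce the same IRI; the second simply lets the JSON property be written without the leading `#`.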

kazarena added a commit to piprate/json-gold that referenced this issue Feb 24, 2019
@azaroth42 azaroth42 added the satisfied Requirement Satisfied label Nov 18, 2019