Cannot parse JSON-LD document if the scheme of @base IRI is non-standard #97

Open
anatoly-scherbakov opened this issue Nov 29, 2020 · 1 comment

@anatoly-scherbakov

Problem

Full code of the example is here: https://gist.github.com/anatoly-scherbakov/9410aba3af518e1a3301b32b693f2579

I am trying to import a JSON-LD document into an RDFLib in-memory graph instance. Versions of the software:

rdflib==5.0.0
rdflib-jsonld==0.5.0
PyLD==2.0.3

The document I am working with contains a @base IRI in its @context.

Expected result

I expect the import to work whenever the @base value is a valid IRI, regardless of its scheme. However, the import works with these:

        'http://robotics.example.com/robots/',
        'https://robotics.example.com/robots/',
        'ftp://robotics.example.com/robots/',
        'file://robotics.example.com/robots/',

but does not work with these:

        'ipns://robotics.example.com/robots/',
        'tftp://robotics.example.com/robots/',
        'ntp://robotics.example.com/robots/',
        'local://robotics.example.com/robots/',

In the latter case, I just get an empty graph.

I looked for a hardcoded list of allowed schemes in the rdflib, rdflib-jsonld, and pyld repositories but did not find one. Could you point me in the right direction? Thank you!

@craig-willis

Just ran into this issue trying to parse an example from https://www.researchobject.org/ro-crate/1.1/appendix/relative-uris.html#establishing-a-base-uri-inside-a-zip-file.

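It appears to be caused by the use of urllib.parse.urljoin, which only resolves relative references against a base whose scheme is in a fixed list (urllib.parse.uses_relative). For any other scheme it returns the relative reference unchanged. There is a documented workaround (https://bugs.python.org/issue18828#msg196794):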

import urllib.parse
urllib.parse.uses_relative.append('<scheme>')
urllib.parse.uses_netloc.append('<scheme>')
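
For illustration, here is a minimal standard-library sketch of the behavior and the workaround, using the ipns base from the report. The relative reference "r2d2" and the helper name register_scheme are hypothetical and only serve the example. I am assuming, not confirming, that rdflib-jsonld resolves every relative IRI through urljoin, so treat this as a starting point rather than a verified fix.

```python
import urllib.parse

# Base IRI with a non-standard scheme, taken from the report above.
base = "ipns://robotics.example.com/robots/"
# Hypothetical relative IRI, used only for this illustration.
relative = "r2d2"

# Before registration: "ipns" is not in urllib.parse.uses_relative,
# so urljoin hands back the relative reference untouched.
before = urllib.parse.urljoin(base, relative)


def register_scheme(scheme):
    """Add a scheme to urllib's registries (hypothetical helper).

    The membership check avoids duplicate entries on repeated calls.
    """
    for registry in (urllib.parse.uses_relative, urllib.parse.uses_netloc):
        if scheme not in registry:
            registry.append(scheme)


register_scheme("ipns")

# After registration: the relative reference is resolved against the base.
after = urllib.parse.urljoin(base, relative)

print(before)
print(after)
```

Note that the registries are module-level lists, so the change affects every urljoin call in the process and has to run before the document is parsed.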
