Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

namespace.py include colon in ALLOWED_NAME_CHARS #663

Merged
merged 4 commits into from
May 14, 2018

Conversation

tgbugs
Copy link
Contributor

@tgbugs tgbugs commented Oct 30, 2016

This commit updates the behavior of is_ncname and
split_uri to match the behavior described here:
https://www.w3.org/TR/html4/types.html#type-name
(found via
http://stackoverflow.com/questions/2053132/is-a-colon-safe-for-friendly-url-use).
This results in the correct splitting of uris such as
http://neurolex.org/wiki/FMA:7191. Extra note that not
all user agents convert : -> %3A since they are not
required to by the spec (RFC3986).

This commit updates the behavior of is_ncname and
split_uri to match the behavior described here:
https://www.w3.org/TR/html4/types.html#type-name
(found via
http://stackoverflow.com/questions/2053132/is-a-colon-safe-for-friendly-url-use).
This results in the correct splitting of uris such as
http://neurolex.org/wiki/FMA:7191. Extra note that not
all user agents conver : -> %3A since they are not
required to by the spec (RFC3986).
@gromgull
Copy link
Member

This makes four of the rdfcore tests fail :(

@gromgull
Copy link
Member

This was also changed in turtle at some point, after we got out turtle parser. We would also need to update the turtle parser to be able to parse those localnames with :s

@gromgull gromgull added this to the rdflib 5.0.0 milestone Jan 12, 2017
@tgbugs
Copy link
Contributor Author

tgbugs commented Jan 12, 2017

When I originally created this I had not chased down the sources of the failures, but as you point it it does break the behvior of the core rdf specification (I have tracked this down for changes I make to NAME_START_CATEGORIES += ["Nd"] which break similar things). I'm not entirely sure what to do about it, but I'm fairly certain that other parsers like the ones used by OWLAPI do have this behavior.

@tgbugs
Copy link
Contributor Author

tgbugs commented Apr 19, 2017

Removing the : when inside is_ncname allows the tests to pass and produces the desired results when serializing. The parser doesn't seem to have any issues reading something like NLXWIKI:FMA:1234567 -> http://neurolex.org/wiki/FMA:1234567 so I think this might be a reasonable solution.

@gromgull
Copy link
Member

Looks good! (And apologies for the slow turn around here)

@gromgull gromgull merged commit 9da21db into RDFLib:master May 14, 2018
tgbugs added a commit to tgbugs/rdflib that referenced this pull request May 14, 2018
This merge includes the changes from

RDFLib#663
RDFLib#793
RDFLib#827

which were branches

tgbugs:fix-uri-spec
tgbugs:total-order-patch
tgbugs:closed-namespace-attribute-error

respectively.
@RDFLib RDFLib deleted a comment from coveralls Oct 28, 2018
@RDFLib RDFLib deleted a comment from coveralls Oct 28, 2018
@joernhees
Copy link
Member

hmmpf, this breaks using our qnames on virtuoso sparql endpoints...

@joernhees joernhees added the parsing Related to a parsing. label Mar 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parsing Related to a parsing.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants