problem with parsing "<http://www.w3.org/2001/XMLSchema#gYear>" #806

Closed
Zinc-30 opened this issue Jan 15, 2018 · 6 comments · Fixed by #1315

@Zinc-30

Zinc-30 commented Jan 15, 2018

Given the input triple

<http://dbpedia.org/resource/Australian_Labor_Party> <http://dbpedia.org/ontology/formationYear> "1891"^^<http://www.w3.org/2001/XMLSchema#gYear> .

parsing it and printing the object:

from rdflib import Graph

g = Graph()
g.parse('./sample.nt', format='nt')
for s, p, o in g:
    print(o.n3())

gives "1891-01-01"^^<http://www.w3.org/2001/XMLSchema#gYear>, which causes trouble for some SPARQL servers.
Please help to fix this.

@Zinc-30 Zinc-30 changed the title problem when parsing "<http://www.w3.org/2001/XMLSchema#gYear>" problem with parsing "<http://www.w3.org/2001/XMLSchema#gYear>" Jan 15, 2018
@joernhees joernhees added bug Something isn't working parsing Related to a parsing. labels Feb 26, 2018
@joernhees joernhees added this to the rdflib 5.0.0 milestone Feb 26, 2018
@joernhees
Member

Yes, I agree that this is weird, to say the least. I'd say it's an unfortunate default caused by this:

URIRef(_XSD_PFX + 'gYear'): parse_date,
This uses the same parsing code for gYear as for date, so a year ends up represented by its first day instead of just the year number :-/
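
To illustrate the mechanism described above, here is a minimal sketch using only the standard library. It is not rdflib's code: the `TO_PYTHON` table, the `"xsd:..."` keys, and both helper functions are hypothetical stand-ins, assuming that a date parser fills in January 1 for a bare year and that the lexical form is then regenerated from the parsed value.

```python
from datetime import date


def parse_year_as_date(lexical: str) -> date:
    """Stand-in for a date parser that accepts a bare year and fills in Jan 1."""
    return date(int(lexical), 1, 1)


# Hypothetical datatype -> parser table; gYear shares a date-producing parser.
TO_PYTHON = {
    "xsd:gYear": parse_year_as_date,
    "xsd:date": date.fromisoformat,
}


def normalized_lexical(lexical: str, datatype: str) -> str:
    """Parse the lexical form, then rebuild it from the Python value."""
    value = TO_PYTHON[datatype](lexical)
    return value.isoformat()


print(normalized_lexical("1891", "xsd:gYear"))  # 1891-01-01
```

Because the value is a full `date`, the year-only lexical form cannot survive the round trip, which matches the output reported in the issue.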

I guess in the long run, we should change this default (backwards incompatible!), which would fix the parser as well.

If you're after a short term workaround for individual Literals: use the normalize arg:

In [1]: import rdflib
RDFLib Version: 5.0.0-dev

In [2]: g = rdflib.Graph()

In [3]: l = rdflib.Literal("2008", datatype=rdflib.XSD.gYear)

In [4]: l
Out[4]: rdflib.term.Literal(u'2008-01-01', datatype=rdflib.term.URIRef(u'http://www.w3.org/2001/XMLSchema#gYear'))

In [5]: l = rdflib.Literal("2008", datatype=rdflib.XSD.gYear, normalize=False)

In [6]: l
Out[6]: rdflib.term.Literal(u'2008', datatype=rdflib.term.URIRef(u'http://www.w3.org/2001/XMLSchema#gYear'))

As a workaround for the parser, you can probably monkey patch the _toPythonMapping dict locally like this (NOT A GOOD IDEA):

In [14]: rdflib.term._toPythonMapping.pop(rdflib.XSD['gYear'])
Out[14]: <function isodate.isodates.parse_date>

In [15]: rdflib.Literal("2008", datatype=rdflib.XSD.gYear)
Out[15]: rdflib.term.Literal(u'2008', datatype=rdflib.term.URIRef(u'http://www.w3.org/2001/XMLSchema#gYear'))

@essepuntato

essepuntato commented Mar 14, 2018

Hi @joernhees

Is there a particular reason why the hack you proposed for the parser is not a good idea? Are there possible issues with doing that? Sorry to ask, but I urgently need RDFLib for a project of mine, and this is a crucial issue for it, so I'm in favour of using your hack unless there is a particular drawback I cannot see or control.

@joernhees
Member

@essepuntato It mostly depends on where it's used. The problem is that it patches _toPythonMapping within the rdflib.term module, so anyone using rdflib together with your code will get the modified version. In a controlled environment you probably know what effects that will have, so there it might be a short-term solution. However, from my own experience, at some point some of your code is re-used, maybe even released. If that hack is still in there at that point, things will behave in unexpected ways for others.

So if there's the slightest chance that anyone else will ever use rdflib and your code together, a better way is to at least restore rdflib's original behaviour afterwards, e.g.:

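import rdflib

# Remove the gYear converter only on the release known to have this default.
if rdflib.__version__ == '4.2.2':
    x = rdflib.term._toPythonMapping.pop(rdflib.XSD['gYear'])

# ... your code here ...

# Put the converter back so other users of rdflib see the stock behaviour.
if rdflib.__version__ == '4.2.2':
    rdflib.term._toPythonMapping[rdflib.XSD['gYear']] = x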
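
The pop-and-restore pattern above could also be wrapped in a context manager so that the entry is put back even if the code in between raises. This is only a sketch under stated assumptions: `temporarily_removed` is a hypothetical helper, and the `converters` dict is a stand-in for the real mapping so that the example runs without rdflib.

```python
from contextlib import contextmanager

# Sentinel so that a stored value of None would still be restored correctly.
_MISSING = object()


@contextmanager
def temporarily_removed(mapping, key):
    """Remove `key` from `mapping` for the duration of the block, then restore it."""
    saved = mapping.pop(key, _MISSING)
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        if saved is not _MISSING:
            mapping[key] = saved


# Stand-in for the real converter table (hypothetical keys and values).
converters = {"gYear": "parse_date", "date": "parse_date"}

with temporarily_removed(converters, "gYear"):
    inside = "gYear" in converters  # False while the block runs

after = "gYear" in converters  # True again once the block exits
print(inside, after)
```

With rdflib, the call would presumably be `temporarily_removed(rdflib.term._toPythonMapping, rdflib.XSD['gYear'])`. Since `_toPythonMapping` is a private name, this remains a stopgap rather than a supported API.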

@rybesh

rybesh commented May 31, 2018

Seems to be a duplicate of #747

@aucampia
Member

Can we close this and use #747 instead?

@white-gecko
Member

Duplicate of #747

@white-gecko white-gecko marked this as a duplicate of #747 May 17, 2021