Support of UTF-8 in SPARQL Update queries #356

Closed
KMax wants to merge 1 commit into RDFLib:master from KMax:master

Conversation

Contributor

@KMax KMax commented Feb 17, 2014

More information at rdflib-dev group -> https://groups.google.com/d/msg/rdflib-dev/fYiUeOLvMXM/k8VvcdkgwWgJ

Tested with OpenRDF Sesame 2.7.10.

@coveralls

Coverage Status

Coverage decreased (-2.04%) when pulling 0fe5885 on KMax:master into c429c4e on RDFLib:master.

@joernhees
Member

hmm, what encoding was used before hard wiring this to utf-8?
(i like utf-8, but i fear this could cause problems with other endpoints)

@gromgull
Member

I assume it was as god intended all strings to be - 7-bit ASCII!

@KMax
Contributor Author

KMax commented Feb 17, 2014

Right, before that, ASCII was used (see the discussion in the group). Unfortunately, I don't have any other endpoint running to do tests, but I suppose UTF-8 should be used everywhere.

Also, as I understand it, Literal is a subclass of the unicode type, so why wasn't the URL parameter encoded as UTF-8?
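
To make the ASCII-versus-UTF-8 point concrete, here is a minimal Python 3 sketch. It is not the code from this pull request; the update string and the `try_encode` helper are hypothetical and exist only for illustration. It shows that a string containing a non-ASCII character cannot be encoded as ASCII, while UTF-8 handles it.

```python
# Hypothetical update string containing one non-ASCII character (u-umlaut).
update = 'INSERT DATA { <urn:a> <urn:b> "Z\u00fcrich" . }'


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# ASCII cannot represent U+00FC, so the helper returns None.
ascii_bytes = try_encode(update, "ascii")

# UTF-8 can: U+00FC becomes the two bytes 0xC3 0xBC.
utf8_bytes = try_encode(update, "utf-8")
```

Decoding `utf8_bytes` with UTF-8 gives back the original string, so nothing is lost on the way to the endpoint as long as both sides agree on the charset.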

@KMax
Contributor Author

KMax commented Feb 19, 2014

Is there a way to write a test for these changes without using an external endpoint? Or are some manual tests enough to accept the changes?

@gromgull
Member

gromgull commented Mar 1, 2014

@KMax - we don't really have a good way to test it.

I was going to say you could test against the rdflib-web endpoint, but then remembered we still haven't implemented UPDATE :)

Trying to read up on what the "correct" thing to do is, all I can find in the SPARQL protocol about encoding is this: http://www.w3.org/TR/sparql11-protocol/#query-via-post-direct

For direct POST, "Note that UTF-8 is the only valid charset here."

But that is about putting the update statement directly in the body, whereas this change is about form-encoded requests.
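
The difference between the two request styles can be sketched in Python 3 as follows. This is an illustration under assumptions, not the code in this pull request: the endpoint URL is hypothetical, the requests are only constructed and never sent, and the content types are the ones the SPARQL 1.1 Protocol names for update operations.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://localhost:3030/ds/update"  # hypothetical endpoint URL
update = "INSERT DATA { <urn:å> <urn:ü> <urn:ß> . }"

# Direct POST: the body is the raw update string.
# The protocol says UTF-8 is the only valid charset for this style.
direct = Request(
    ENDPOINT,
    data=update.encode("utf-8"),
    headers={"Content-Type": "application/sparql-update; charset=utf-8"},
    method="POST",
)

# URL-encoded POST: the update travels as a form parameter.
# urlencode() percent-encodes the UTF-8 bytes, so the body is pure ASCII.
form = Request(
    ENDPOINT,
    data=urlencode({"update": update}).encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)
```

In the form-encoded case the charset question moves from the Content-Type header into the percent-encoding step, which is why the protocol note about direct POST does not settle it.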

I'll test it against fuseki, and if it works we'll merge, ok?

@gromgull
Member

gromgull commented Mar 2, 2014

So, I looked at Fuseki's web interface. When posting an UPDATE statement with non-ASCII Unicode characters:

INSERT DATA { <urn:å> <urn:ü> <urn:ß> .  }

Chrome posts a plain content-type header (no charset) and this body:

update=INSERT+DATA+%7B+%3Curn%3A%C3%A5%3E+%3Curn%3A%C3%BC%3E+%3Curn%3A%C3%9F%3E+.++%7D

This is UTF-8, percent encoded.
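
As a cross-check, the same body can be reproduced with Python 3's standard library. This is a minimal sketch, not the code in this pull request; it relies on `urllib.parse.urlencode` defaulting to UTF-8 with `+` for spaces.

```python
from urllib.parse import parse_qs, urlencode

# The update statement from above, including the two spaces before "}".
update = "INSERT DATA { <urn:å> <urn:ü> <urn:ß> .  }"

# Form-encode it: UTF-8 bytes are percent-encoded, spaces become "+".
body = urlencode({"update": update})
print(body)

# Decoding the form body as UTF-8 recovers the original statement.
decoded = parse_qs(body)["update"][0]
```

Here `å` (U+00E5) becomes `%C3%A5`, `ü` (U+00FC) becomes `%C3%BC`, and `ß` (U+00DF) becomes `%C3%9F`, which are their two-byte UTF-8 encodings.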

@gromgull gromgull closed this in a02cefc Mar 2, 2014
@gromgull
Member

gromgull commented Mar 2, 2014

It works with Fuseki as well!

Thanks!

@joernhees
Member

we should fix those random test failures when endpoints are down...

@gromgull
Member

gromgull commented Mar 2, 2014

cough cough #269 #256 ...

@joernhees
Member

true... but see #364 nevertheless... it seems even though tests should be skipped on travis they aren't

4 participants