How to access SPARQL endpoint? #333

Closed
VladimirAlexiev opened this issue Aug 7, 2023 · 3 comments

@VladimirAlexiev

https://docs.datacommons.org/api/ says

Graph Query/SPARQL: Given a subgraph where some of the nodes are variables, retrieve possible matches. This corresponds to a subset of the graph query language SPARQL.

But where can I make SPARQL queries?
Looking at https://github.com/datacommonsorg/api-python/blob/master/datacommons/sparql.py#L77, I guessed that's at /query.

But https://www.datacommons.org/query?sparql=foo says

The requested URL was not found on the server.

@shifucun
Contributor

Hi Vladimir, the SPARQL endpoint is documented at https://docs.datacommons.org/api/rest/v1/query

@VladimirAlexiev
Author

VladimirAlexiev commented Dec 31, 2023

Thanks @shifucun !
Please add a link from "Graph Query/SPARQL" to that API, since https://docs.datacommons.org/api/rest/v1/query is HIDDEN at the https://docs.datacommons.org/api/ page.

Also, please document the variant of pseudo-SPARQL that you support.
(It would be better to support real SPARQL, but I'm not sure how easy this is.)

Here are examples of valid SPARQL queries that are rejected by your API.

  • select*{?s?p?o}limit 10
    {"code":3,"message":"Invalid sparql query string\nselect*{?s?p?o}limit 10"}
    The reason is that each variable (or perhaps only the leading one?) must have a type specified, and this is done with typeOf rather than rdf:type
  • select ?s {?s typeOf City} limit 10
    {"code":3,"message":"Invalid sparql query string\nselect ?s {?s typeOf City} limit 10"}
    The reason is that the WHERE keyword is missing (standard SPARQL treats it as optional)
  • select ?s where {?s typeOf City; ?p ?o} limit 10
    {"code":3,"message":"Node should be string, got [City ?p ?o] of type []string"}
    Apparently the ; shorthand (several predicate-object pairs for one subject) is not supported
  • select ?s ?p ?o where {?s typeOf City. ?s ?p ?o} limit 10
    {"header":["?s"]}
    The query is accepted, but instead of returning all properties of each City it returns no rows, and the header lists only ?s
  • select * where {?s typeOf City. ?s name ?name}limit+10
    {"code":3,"message":"Invalid sparql query string\nselect* where{?s typeOf City.?s name?name}limit 10"}
    The reason is that * is not supported: all desired variables must be listed
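
The restrictions observed in the list above can be collected into a small client-side check. This is only a sketch: the rules are inferred from the error messages in this thread, not from a published grammar, and the function name `preflight` is hypothetical. It catches the WHERE, `*` and `;` cases before a request is sent; it does not prove that the API will accept the query.

```python
import re


def preflight(query: str) -> list[str]:
    """Return likely problems; an empty list means none were spotted."""
    problems = []
    # Everything before the first "{" is treated as the SELECT header.
    head, brace, body = query.partition("{")
    if not brace:
        return ["no graph pattern found"]
    if not re.match(r"\s*select\b", head, re.IGNORECASE):
        problems.append("query does not start with SELECT")
    if "*" in head:
        problems.append("SELECT * is rejected; list every variable")
    if not re.search(r"\bwhere\s*$", head, re.IGNORECASE):
        problems.append("the WHERE keyword is required")
    if ";" in body:
        problems.append("the ';' shorthand is rejected; repeat the subject")
    if not re.search(r"\btypeOf\b", body):
        problems.append("no typeOf constraint found")
    return problems


if __name__ == "__main__":
    for q in (
        "select ?s {?s typeOf City} limit 10",
        "select ?s where {?s typeOf City; ?p ?o} limit 10",
        "select ?s where {?s typeOf City} limit 10",
    ):
        print(q, "->", preflight(q))
```

Note that the fourth query in the list passes this check even though the API returns no rows for it, so an empty result still has to be handled separately.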

It is also worth noting that one can use GET to make queries, although not in full compliance with https://www.w3.org/TR/sparql11-protocol/#query-via-get :

curl -HX-API-Key:AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI 'https://api.datacommons.org/v1/query?sparql=select?s+where\{?s+typeOf+City\}limit+10'
{"header":["?s"],"rows":[{"cells":[{"value":"geoId/2649160"}]},{"cells":[{"value":"geoId/1820206"}]},{"cells":[{"value":"geoId/2634860"}]},{"cells":[{"value":"geoId/2043025"}]},{"cells":[{"value":"geoId/1992592"}]},{"cells":[{"value":"geoId/2624220"}]},{"cells":[{"value":"geoId/1775744"}]},{"cells":[{"value":"geoId/2751838"}]},{"cells":[{"value":"geoId/2016050"}]},{"cells":[{"value":"geoId/1876598"}]}]}

I escaped the curly braces because curl performs URL "globbing" by default (https://stackoverflow.com/questions/8333920/passing-a-url-with-brackets-to-curl). A better way is to use the -g (--globoff) option:

curl -HX-API-Key:AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI -g 'https://api.datacommons.org/v1/query?sparql=select?s+where{?s+typeOf+City}limit+10'
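
For scripted access, the brace problem can be sidestepped by percent-encoding the query instead of escaping it by hand. The sketch below builds the same GET request as the curl calls above using only the Python standard library. The helper name `build_get_request` is hypothetical, and it assumes the server decodes a percent-encoded `sparql` parameter the same way as the hand-written URL, which I have not verified against this endpoint.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint taken from the curl calls in this comment.
V1_QUERY_URL = "https://api.datacommons.org/v1/query"


def build_get_request(sparql: str, api_key: str) -> Request:
    """Build (but do not send) a GET request for the v1 query endpoint."""
    # urlencode turns spaces into "+" and percent-encodes "{", "}" and "?",
    # so no shell or curl escaping is needed.
    url = V1_QUERY_URL + "?" + urlencode({"sparql": sparql})
    return Request(url, headers={"X-API-Key": api_key})


if __name__ == "__main__":
    req = build_get_request("select ?s where {?s typeOf City} limit 10", "YOUR_API_KEY")
    print(req.full_url)
```

The request can then be sent with `urllib.request.urlopen(req)` and the body parsed with `json.load`.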

It is strange that the same query with name added returns a completely different set of cities.

curl -HX-API-Key:AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI -g 'https://api.datacommons.org/v1/query?sparql=select?s?name+where{?s+typeOf+City.?s+name?name}limit+10'
{"header":["?s","?name"],"rows":[{"cells":[{"value":"wikidataId/Q1377518"},{"value":"Highgate"}]},{"cells":[{"value":"wikidataId/Q13902505"},{"value":"Westbrook"}]},{"cells":[{"value":"wikidataId/Q14080276"},{"value":"Askham"}]},{"cells":[{"value":"wikidataId/Q1383013"},{"value":"Ewarton"}]},{"cells":[{"value":"wikidataId/Q1374967"},{"value":"Euroea in Phoenicia"}]},{"cells":[{"value":"wikidataId/Q14060205"},{"value":"Maghrawa"}]},{"cells":[{"value":"wikidataId/Q1405461"},{"value":"Ulmarra"}]},{"cells":[{"value":"wikidataId/Q1396431"},{"value":"Igaliku Kujalleq"}]},{"cells":[{"value":"wikidataId/Q1391691"},{"value":"Nuriootpa"}]},{"cells":[{"value":"wikidataId/Q1382237"},{"value":"Churchbridge"}]}]}

The first hit is geoId/2634860 Greendale Township vs wikidataId/Q1377518 Highgate.
It's true that a query without order by can return any results, but I would expect the results to be predictable.
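
Until the ordering is predictable, one workaround is to flatten the response and sort it on the client. The sketch below assumes only the response shape visible in the outputs above (`header`, `rows`, `cells`, `value`); the function names are hypothetical. Sorting client-side makes the order of the returned rows stable, but it cannot control which 10 rows the server picks under `limit`.

```python
def rows_to_dicts(response: dict) -> list[dict]:
    """Flatten a query response into one dict per row, keyed by variable name."""
    header = response.get("header", [])
    # "rows" is absent when nothing matched, as in the {"header":["?s"]} reply.
    rows = response.get("rows", [])
    return [
        {var: cell.get("value") for var, cell in zip(header, row.get("cells", []))}
        for row in rows
    ]


def stable_rows(response: dict, key: str) -> list[dict]:
    """Return the rows sorted by one variable, e.g. "?s" or "?name"."""
    return sorted(rows_to_dicts(response), key=lambda r: r.get(key) or "")


if __name__ == "__main__":
    sample = {
        "header": ["?s", "?name"],
        "rows": [
            {"cells": [{"value": "wikidataId/Q13902505"}, {"value": "Westbrook"}]},
            {"cells": [{"value": "wikidataId/Q1377518"}, {"value": "Highgate"}]},
        ],
    }
    for row in stable_rows(sample, "?name"):
        print(row)
```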

Cheers!

@kmoscoe
Contributor

kmoscoe commented Jul 9, 2024

Hello Vladimir,
We've now added support for a SPARQL endpoint in v2 of the Data Commons APIs. Documentation is here: https://docs.datacommons.org/api/rest/v2/sparql.html

As you note, SPARQL support is limited. GET requests have character limits and so are not recommended. The requirement for a WHERE clause is actually a bug, which we plan on fixing.

Hope this helps,
Kara
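
Since GET is not recommended because of the character limits, a POST request is the natural alternative. The sketch below builds one with the Python standard library. The URL path `/v2/sparql` and the JSON field name `query` are assumptions on my part and should be checked against the v2 documentation linked above; the helper name `build_post_request` is hypothetical.

```python
import json
from urllib.request import Request

# Assumed endpoint path; confirm against the v2 SPARQL documentation.
V2_SPARQL_URL = "https://api.datacommons.org/v2/sparql"


def build_post_request(sparql: str, api_key: str) -> Request:
    """Build (but do not send) a POST request carrying the query as JSON."""
    # The field name "query" is an assumption, not confirmed in this thread.
    body = json.dumps({"query": sparql}).encode("utf-8")
    return Request(
        V2_SPARQL_URL,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_post_request("SELECT ?s WHERE {?s typeOf City} LIMIT 10", "YOUR_API_KEY")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Because the query travels in the request body, it needs no URL encoding and is not subject to URL length limits.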

@kmoscoe kmoscoe closed this as completed Jul 9, 2024