A Java CLI to upload RDF files, and execute SPARQL queries from string, URL or multiple files using RDF4J.
- The user can execute SPARQL queries by
  - Passing a SPARQL query string in the -sp param
  - Providing a URL in the -f param
  - Providing the URL of a GitHub repository containing .rq files to execute in the -f param
  - Providing the path to a directory where the queries are stored in .rq text files and executed in the alphabetical order of their filename
  - Providing a YAML file with multiple ordered queries
- Update, construct and select SPARQL operations are supported.
- It is possible to optionally define username and password for the SPARQL endpoint.
See the Data2Services framework documentation to run d2s-sparql-operations as part of workflows that generate RDF knowledge graphs from structured data.
Download the .jar file from the latest GitHub release. You can use this command to do it automatically in a Bash terminal:
wget https://github.com/MaastrichtU-IDS/d2s-sparql-operations/releases/latest/download/sparql-operations.jar
Move the jar somewhere you can call it easily, e.g. in a bin folder in your home folder:
mkdir -p ~/bin && mv sparql-operations.jar ~/bin/sparql-operations.jar
- Run the jar to upload RDF files:
java -jar ~/bin/sparql-operations.jar -o upload -i "*.ttl" -e "https://graphdb.dumontierlab.com/repositories/test/statements" -u 'username' -p 'password' -g "http://my-graph.com"
Optionally use -g "http://my-graph.com" to specify a graph to upload the data to.
You can also define the username and password using environment variables:
export D2S_USERNAME=myusername
export D2S_PASSWORD=mypassword
- Execute a SPARQL query provided as argument:
java -jar ~/bin/sparql-operations.jar -o select -q "SELECT * WHERE {?s ?p ?o .} LIMIT 10" -e "https://graphdb.dumontierlab.com/repositories/test"
See below for more examples of executing SPARQL queries and other operations.
Compile the jar file from the source code:
mvn clean package
Move it:
mv target/sparql-operations-*-jar-with-dependencies.jar ~/bin/sparql-operations.jar
Available on DockerHub. The latest image is automatically built from the latest commit on the master branch on GitHub.
docker pull umids/d2s-sparql-operations
You can also clone the GitHub repository and build the Docker image locally (unnecessary if you do docker pull):
git clone https://github.com/MaastrichtU-IDS/d2s-sparql-operations
cd d2s-sparql-operations
docker build -t umids/d2s-sparql-operations .
N.B.: you will need to remove the \ and make the docker run commands one-line for Windows PowerShell.
docker run -it --rm umids/d2s-sparql-operations -h
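The multi-line commands in this document can be collapsed into one line mechanically before pasting them into PowerShell. The sketch below is a hypothetical helper (join_continuations is not part of d2s-sparql-operations) that needs a POSIX shell with awk, e.g. Git Bash. It assumes every continued line ends with a backslash and does not try to handle comment lines mixed into the command.

```shell
# Hypothetical helper: join backslash-continued lines read from stdin
# into a single line, dropping the indentation of continuation lines.
join_continuations() {
  awk '{
    sub(/^[ \t]+/, "")              # strip leading indentation
    if (sub(/[ \t]*\\$/, "")) {     # line ended with a backslash
      printf "%s ", $0              # keep going on the same output line
    } else {
      print                         # last line of the command
    }
  }'
}

# Usage (paste the command, then press Ctrl-D):
#   join_continuations
```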
Upload RDF files to a SPARQL endpoint:
docker run -it --rm -v $(pwd):/data umids/d2s-sparql-operations -o upload \
-i "*.ttl" \
-e "https://graphdb.dumontierlab.com/repositories/test/statements" \
-u $USERNAME -p $PASSWORD \
-g "http://my-graph.com"
On DBpedia using a SPARQL query string as argument.
docker run -it --rm umids/d2s-sparql-operations -o select \
-q "select distinct ?Concept where {[] a ?Concept} LIMIT 10" \
-e "http://dbpedia.org/sparql"
Multiple INSERT on graphdb.dumontierlab.com, using files in a repository from the local file system.
# Alternative endpoint: -e "https://graphdb.dumontierlab.com/repositories/test/statements"
docker run -it --rm umids/d2s-sparql-operations \
-e "https://graphdb.dumontierlab.com" -r "test" \
-o update -u $USERNAME -p $PASSWORD \
-i "https://github.com/MaastrichtU-IDS/d2s-sparql-operations/tree/master/src/main/resources/insert-examples"
- Note that GraphDB and RDF4J Server require /statements at the end of the endpoint URL when doing an update.
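When scripting both queries and updates against the same repository, the /statements suffix can be derived instead of typed twice. This is a minimal sketch; update_endpoint is a hypothetical helper name, not an option of the CLI.

```shell
# Hypothetical helper: derive the update endpoint from a repository URL by
# appending /statements, unless the URL already ends with it.
update_endpoint() {
  url="${1%/}"    # drop a trailing slash, if any
  case "$url" in
    */statements) printf '%s\n' "$url" ;;
    *)            printf '%s\n' "$url/statements" ;;
  esac
}

update_endpoint "https://graphdb.dumontierlab.com/repositories/test"
# prints https://graphdb.dumontierlab.com/repositories/test/statements
```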
On graphdb.dumontierlab.com using GitHub URL to get the SPARQL query from a file.
docker run -it --rm umids/d2s-sparql-operations -o construct \
-e "https://graphdb.dumontierlab.com/repositories/ncats-red-kg" \
-i "https://raw.githubusercontent.com/MaastrichtU-IDS/d2s-sparql-operations/master/src/main/resources/example-construct-pathways.rq"
We crawl the example GitHub repository and execute each .rq file.
docker run -it --rm umids/d2s-sparql-operations \
-o select -e "http://dbpedia.org/sparql" \
-i "https://github.com/MaastrichtU-IDS/d2s-sparql-operations/tree/master/src/main/resources/select-examples"
Crawling a GitHub repository from its URL is based on HTML parsing, hence it might be unstable.
A YAML file can be used to provide multiple ordered queries. See example from GitHub.
docker run -it --rm umids/d2s-sparql-operations \
-o select -e "http://dbpedia.org/sparql" \
-i "https://raw.githubusercontent.com/MaastrichtU-IDS/d2s-sparql-operations/master/src/main/resources/example-queries.yaml"
Beta
To split an object into multiple statements using a delimiter, and insert the statements generated by the split in the same graph.
E.g.: a statement with value "1234,345,768" would be split into 3 statements: "1234", "345" and "768".
# --split-delete deletes the original statement after splitting
# --uri-expansion: use 'infer' to do it automatically using prefixcommons
# -e is the RDF4J server URL, -rep the RDF4J server repository ID
# Not used here: --trim-delimiter '"'
docker run -it \
umids/d2s-sparql-operations -op split \
--split-property "http://w3id.org/biolink/vocab/has_participant" \
--split-class "http://w3id.org/biolink/vocab/GeneGrouping" \
--split-delimiter "," \
--split-delete \
--uri-expansion "https://w3id.org/d2s/" \
-e "https://graphdb.dumontierlab.com" \
-rep "test" \
-u USERNAME -pw PASSWORD
# For SPARQLRepository
# -e "https://graphdb.dumontierlab.com/repositories/test" \
# -uep "https://graphdb.dumontierlab.com/repositories/test/statements" \
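To preview what a given delimiter would do to a value before running the split operation, the splitting itself can be imitated in the shell. This only illustrates the "1234,345,768" example and is not how the tool is implemented. split_value is a hypothetical helper and assumes a single-character delimiter.

```shell
# Hypothetical helper: print each part of a delimited value on its own line.
# $1 is the value, $2 a single-character delimiter.
split_value() {
  printf '%s\n' "$1" | tr "$2" '\n'
}

split_value "1234,345,768" ","
# prints:
# 1234
# 345
# 768
```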
3 variables can be set in the SPARQL queries using a ?_ prefix: ?_input, ?_output and ?_service. See example:
INSERT {
GRAPH <?_output> {
?Concept a <https://w3id.org/d2s/Concept> .
}
} WHERE {
SERVICE <?_service> {
GRAPH <?_input> {
SELECT * {
[] a ?Concept .
} LIMIT 10
} } }
Execute:
docker run -it --rm umids/d2s-sparql-operations \
-op update -e "https://graphdb.dumontierlab.com/repositories/test/statements" \
-u $USERNAME -pw $PASSWORD \
-i "https://raw.githubusercontent.com/MaastrichtU-IDS/d2s-sparql-operations/master/src/main/resources/example-insert-variables.rq" \
--var-input http://www.ontotext.com/explicit \
--var-output https://w3id.org/d2s/output \
--var-service http://localhost:7200/repositories/test
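To inspect the query that results from a given set of --var-* values, the placeholders can be filled in by hand. The sketch below assumes the tool performs a plain textual replacement of ?_input, ?_output and ?_service, which this document does not confirm. fill_query_vars is a hypothetical helper, not part of the CLI.

```shell
# Hypothetical helper: fill the ?_input, ?_output and ?_service placeholders
# in a query read from stdin.
# $1 = input graph, $2 = output graph, $3 = service URL.
# Assumes the values contain no '|' or '&', which are special to sed here.
fill_query_vars() {
  sed -e "s|?_input|$1|g" \
      -e "s|?_output|$2|g" \
      -e "s|?_service|$3|g"
}

# Usage, with a local copy of the query file:
#   fill_query_vars http://www.ontotext.com/explicit \
#     https://w3id.org/d2s/output \
#     http://localhost:7200/repositories/test < example-insert-variables.rq
```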
From data2services-transform-repository, use a federated query to transform generic RDF generated by AutoR2RML and xml2rdf to the BioLink model, and load it to a different repository.
# DrugBank
docker run -it --rm -v "$PWD/sparql/insert-biolink/drugbank":/data \
umids/d2s-sparql-operations \
-i "/data" -u USERNAME -pw PASSWORD \
-e "https://graphdb.dumontierlab.com/repositories/ncats-test/statements" \
--var-service http://localhost:7200/repositories/test \
--var-input http://data2services/graph/xml2rdf \
--var-output https://w3id.org/d2s/graph/biolink/drugbank
# HGNC
docker run -it --rm -v "$PWD/sparql/insert-biolink/hgnc":/data \
umids/d2s-sparql-operations \
-i "/data" -u USERNAME -pw PASSWORD \
-e "https://graphdb.dumontierlab.com/repositories/ncats-test/statements" \
--var-service http://localhost:7200/repositories/test \
--var-input http://data2services/graph/autor2rml \
--var-output https://w3id.org/d2s/graph/biolink/hgnc
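Queries in a directory are executed in the alphabetical order of their filename, so it can help to preview that order before mounting the directory. The sketch below assumes a plain byte-wise ordering (LC_ALL=C); the exact comparison used by the tool is not stated here. list_queries is a hypothetical helper.

```shell
# Hypothetical helper: list the .rq files of a directory in byte-wise
# alphabetical order, ignoring any other files.
list_queries() {
  for f in "$1"/*.rq; do
    # Skip the unexpanded pattern when the directory has no .rq file
    [ -e "$f" ] && basename "$f"
  done | LC_ALL=C sort
}

# Usage:
#   list_queries "$PWD/sparql/insert-biolink/drugbank"
```

With this ordering, 10-x.rq sorts before 2-x.rq, so zero-padded prefixes such as 01-, 02-, 10- keep the queries in the intended sequence.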