ROBOT can execute SPARQL queries against an ontology. The verify command is similar, but is used to test that an ontology conforms to the specified rules.
The query command can execute SPARQL ASK, SELECT, and CONSTRUCT queries by using the --query option with two arguments: a query file and an output file. The output file will only be written if the query returns some results. The output format will be inferred from the output file extension, or you can use the --format option.
ASK always produces true or false.
SELECT produces a table, if there are any results, defaulting to CSV format. For example:
robot query --input nucleus.owl \
--query cell_part.sparql results/cell_part.csv
This produces cell_part.csv.
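The contents of cell_part.sparql are not shown in this document. As a minimal sketch, assuming a hypothetical query that simply lists every OWL class with its rdfs:label (the real file may ask something quite different), the example could be set up as follows. The robot call is guarded so the script still runs where ROBOT or nucleus.owl is unavailable.

```shell
# Hypothetical stand-in for cell_part.sparql; the real query is not shown here.
cat > cell_part.sparql <<'EOF'
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?class ?label
WHERE {
  ?class a owl:Class ;
         rdfs:label ?label .
}
ORDER BY ?label
EOF

# Make sure the directory for the output file is present.
mkdir -p results

# Run the query only when ROBOT and the input ontology are available.
if command -v robot >/dev/null 2>&1 && [ -f nucleus.owl ]; then
  robot query --input nucleus.owl \
    --query cell_part.sparql results/cell_part.csv
fi
```

Declaring both prefixes in the query file matters, since an undefined prefix is a common cause of query parse errors.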
CONSTRUCT produces RDF data, if there are any results, defaulting to Turtle format:
robot query --format ttl \
--input nucleus.owl \
--query part_of.sparql results/part_of.ttl
This produces part_of.ttl.
The --query option can be repeated to execute multiple queries, which may be of different types:
robot query --input nucleus.owl \
--query cell_part.sparql results/cell_part.csv \
--query part_of.sparql results/part_of.ttl
Instead of specifying one or more pairs (query file, output file), you can specify a single --output-dir and use the --queries option to provide one or more queries of any type. Each output file will be written to the output directory with the same base name as the query file that produced it. For example, the foo.sparql query file will produce the foo.csv file. The output directory must exist.
robot query --input nucleus.owl \
--queries cell_part_ask.sparql \
--output-dir results/
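Because the output directory must already exist, a wrapper script may want to create it before calling ROBOT. The sketch below does that and prints the output path implied by the naming rule described above. The expected_output helper is hypothetical, and its .csv extension assumes a SELECT query in the default format.

```shell
# Hypothetical helper: map a query file to the output path described above,
# assuming a SELECT query written in the default CSV format.
expected_output() {
  query_file=$1
  output_dir=$2
  base=$(basename "$query_file" .sparql)
  printf '%s/%s.csv\n' "${output_dir%/}" "$base"
}

# --output-dir must exist before ROBOT runs, so create it up front.
mkdir -p results

# Prints: results/foo.csv
expected_output foo.sparql results/

# Run the query only when ROBOT and the input ontology are available.
if command -v robot >/dev/null 2>&1 && [ -f nucleus.owl ]; then
  robot query --input nucleus.owl \
    --queries cell_part_ask.sparql \
    --output-dir results/
fi
```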
By default, query ignores import statements. To include all imports as named graphs, add --use-graphs true.
robot query --input imports.owl \
--use-graphs true --catalog catalog.xml \
--query named_graph.sparql results/named_graph.csv
The example above also uses the global --catalog option to specify the catalog file for the import mapping. The default graph is the union of all graphs, which allows querying over an ontology and all its imports.
The names of the graphs correspond to the ontology IRIs of the imports. If the import does not have an ontology IRI, one will be automatically generated. Running query with the -vv flag will print the names of all graphs as they are added.
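named_graph.sparql is not reproduced in this document. One plausible shape for such a query, offered as an assumption rather than the actual file, uses the standard SPARQL GRAPH keyword to list each named graph together with its triple count:

```shell
# Hypothetical stand-in for named_graph.sparql; the real query is not shown here.
cat > named_graph.sparql <<'EOF'
SELECT ?graph (COUNT(*) AS ?triples)
WHERE {
  GRAPH ?graph { ?s ?p ?o }
}
GROUP BY ?graph
ORDER BY ?graph
EOF

mkdir -p results

# Run the query only when ROBOT, the ontology, and the catalog are available.
if command -v robot >/dev/null 2>&1 && [ -f imports.owl ] && [ -f catalog.xml ]; then
  robot query --input imports.owl \
    --use-graphs true --catalog catalog.xml \
    --query named_graph.sparql results/named_graph.csv
fi
```

A query without a GRAPH clause runs against the default graph, which here is the union of the ontology and its imports.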
The query command also supports SPARQL UPDATE to insert and delete triples.
robot query --input nucleus.owl \
--update update.ru \
--output results/nucleus_update.owl
When using SPARQL update, you can either provide an --output for the updated ontology, or chain it into another command.
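The update.ru file is not shown in this document. As a hedged illustration of what an update file might contain, the sketch below writes a hypothetical SPARQL UPDATE that adds an rdfs:comment to every labelled class. The comment text and the matching pattern are assumptions made for the example.

```shell
# Hypothetical stand-in for update.ru; the real update is not shown here.
cat > update.ru <<'EOF'
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

INSERT { ?class rdfs:comment "Reviewed" }
WHERE  { ?class a owl:Class ; rdfs:label ?label }
EOF

mkdir -p results

# Apply the update only when ROBOT and the input ontology are available.
if command -v robot >/dev/null 2>&1 && [ -f nucleus.owl ]; then
  robot query --input nucleus.owl \
    --update update.ru \
    --output results/nucleus_update.owl
fi
```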
You can perform multiple updates in one command to reduce time spent loading and saving the ontology. Updates are processed in the order that they are input.
robot query --input nucleus.owl \
--update update.ru \
--update revert.ru \
--output results/nucleus.owl
The --update option only updates the ontology itself, not any of the imports.
Warning: The output of SPARQL updates will not include xsd:string datatypes, because xsd:string is considered implicit in RDF version 1.1. This behaviour differs from other ROBOT commands, where xsd:string datatypes from the input are maintained in the output.
For very large ontologies, it may help to reduce heap memory usage. You can use --temporary-file true to ensure that intermediate results are stored in a temporary file. Note that this makes execution slower.
robot query --input nucleus.owl \
--update update.ru \
--temporary-file true \
--output results/nucleus_update_2.owl
For very large ontologies, it may be beneficial to load the ontology to a mapping file on disk rather than loading it into memory. This is supported by Jena TDB Datasets. To execute a query with TDB, use --tdb true:
robot query --input nucleus.ttl --tdb true \
--query cell_part.sparql results/cell_part.csv
Please note that this will only work with ontologies in RDF/XML or Turtle syntax, and not with Manchester Syntax. Attempting to load an ontology in a different syntax will result in a Syntax Error. ROBOT will create a directory to store the ontology as a dataset, which defaults to .tdb. You can change the location of the TDB directory by using --tdb-directory <directory>. If a --tdb-directory is specified, you do not need to include --tdb true. If you've already created a TDB directory, you can query from the TDB dataset without needing to specify an --input; just include the --tdb-directory.
Once the query operation is complete, ROBOT will remove the TDB directory. If you are performing many query commands on one ontology, you can include --keep-tdb-mappings true to prevent ROBOT from removing the TDB directory. This will greatly reduce the execution time of subsequent queries.
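Putting these options together, a workflow that runs two queries against one TDB dataset might look like the sketch below. It uses only the options described above. The .tdb-nucleus directory name is an arbitrary choice for this example, and the robot calls are skipped when ROBOT or the input file is missing.

```shell
TDB_DIR=.tdb-nucleus   # arbitrary directory name chosen for this sketch
mkdir -p results

if command -v robot >/dev/null 2>&1 && [ -f nucleus.ttl ]; then
  # First run: load nucleus.ttl into TDB and keep the directory afterwards.
  robot query --input nucleus.ttl \
    --tdb-directory "$TDB_DIR" --keep-tdb-mappings true \
    --query cell_part.sparql results/cell_part.csv

  # Second run: reuse the existing dataset, so no --input is given.
  # --keep-tdb-mappings is omitted, so ROBOT removes the directory when done.
  robot query --tdb-directory "$TDB_DIR" \
    --query part_of.sparql results/part_of.ttl
fi
```

Leaving --keep-tdb-mappings off the final call is a design choice for cleanup. Keep it on every call if later steps still need the dataset.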
The ontology is never loaded as an OWLOntology object, since doing so loads the whole ontology into memory. Therefore, TDB cannot be used while chaining commands or with the --update option.
Finally, please be aware that ROBOT uses standard TDB (TDB1), which is not compatible with TDB2. This means that you cannot use a dataset created by ROBOT with a program that expects TDB2, and you cannot use an existing TDB2 dataset with ROBOT.
You can also choose to just create a TDB directory without running a query using the --create-tdb option. This is useful for workflows where a TDB directory may need to be initialized in one step and queried multiple times in another.
robot query --input nucleus.ttl --create-tdb true
The file provided for --update does not exist. Check the path and try again.
The --query, --select, and --construct options require two arguments: a query file and an output file (--query <query> <output>).
You must specify a query to execute with --query or --queries.
The query could not be parsed. Often, this is the result of an undefined prefix in the query. See the error message for more details.
Each SPARQL query should be a SELECT, ASK, DESCRIBE, or CONSTRUCT.